<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html" />
  <link rel="self" type="application/atom+xml" href="http://www.insideria.com/atom.xml" />
  <id>tag:www.insideria.com,2009://34/tag:www.insideria.com,2008://34.26336-</id>
  <updated>2009-11-05T20:01:32Z</updated>
  <title>Comments for The &quot;One Million Records&quot; Challenge (http://www.insideria.com/2008/08/the-one-million-records-challe.html)</title>
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.21-en</generator>
  <entry>
    <id>tag:www.insideria.com,2008://34.26336</id>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html" />
    <link rel="service.edit" type="application/atom+xml" href="http://blogs.oreilly.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=34/entry_id=26336" title="The &quot;One Million Records&quot; Challenge" />
    <published>2008-08-07T15:15:00Z</published>
    <updated>2008-08-07T15:08:17Z</updated>
    <title>The &quot;One Million Records&quot; Challenge</title>
    <summary>SAP&apos;s Web Dynpro can easily handle loading 1 million records into a visual table and sorting them - so too can Curl.  Can your RIA platform of choice handle 1 million rows in a table?</summary>
    <author>
      <name>Richard Monson-Haefel</name>
      <uri>http://www.curl.com</uri>
    </author>
    
    <category term="Blogs" />
    
    <content type="html" xml:lang="en" xml:base="http://www.insideria.com/">
      <![CDATA[<p>Yesterday I had the fortune of reading a very interesting <a href="https://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/10552">blog post</a> by Thomas Jung showing how <a href="http://help.sap.com/saphelp_nw04/helpdata/EN/a5/1a1e3e7181b60ae10000000a114084/content.htm">SAP's Web Dynpro ABAP</a> can load and sort as many as 1 million records!  It's a cool demonstration of the power of that platform. I'm not sure if Web Dynpro is a RIA platform or a fat client - it works in a web browser and has a desktop runtime, so perhaps it's a "fit client" like <a href="http://www.curl.com">Curl</a> or <a href="http://www.adobe.com/products/air/">Adobe AIR</a>.  Thomas was apparently inspired by <a href="http://www.redmonk.com/cote/2008/07/03/episode-17-curls-richard-monson-haefel-ria-middleware-search-for-flash/">Episode #17 of RIA Weekly</a>, in which I discussed the need for enterprise RIA platforms to be able to handle hundreds of thousands of records.  He wanted to see how Web Dynpro stood up to that challenge, and from what I can tell it did really well.</p>

<p>I decided to mess around with Curl to see how much data it could really handle. In a fit of curiosity I implemented some benchmark tests that are so unscientific they would make Descartes roll over in his grave.  I ran the tests on my MacBook Pro on Windows XP through VMware Fusion.  Yes, you read that right - not exactly the most performant configuration in the world. Despite my less than stellar runtime environment, I was really pleased with Curl's performance.</p>

<p>First I loaded up 10,000 records into a table. The data came from a CSV file.  Each record represents a contact and has five columns (e.g. full name, email, work-phone, home-phone).  To get to 10k records I wrote a Java program that generates the records using UUIDs that are converted into strings for the value of each field. I trimmed the string values to what I thought were appropriate sizes for the various fields (e.g. 20 characters for name, 13 for phone numbers).  You can get the Java program <a href="http://www.monson-haefel.com/insideria/ContactGenerator.java">here</a>.</p>
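
<p>For readers who cannot reach the linked file, the approach described above can be sketched roughly as follows. This is a minimal illustration and not the linked ContactGenerator.java: the class name, the method names, and the email width are assumptions, and it writes only the four fields named above.</p>

```java
import java.util.UUID;

public class ContactGeneratorSketch {
    // Field widths: 20 (name) and 13 (phone) come from the post;
    // the email width is a made-up value for illustration.
    static final int NAME_LEN = 20;
    static final int EMAIL_LEN = 16;
    static final int PHONE_LEN = 13;

    // Returns the first maxLen characters of a fresh random UUID string.
    static String randomField(int maxLen) {
        String s = UUID.randomUUID().toString(); // 36 characters, no commas
        return s.substring(0, Math.min(maxLen, s.length()));
    }

    // Builds one CSV row: full name, email, work phone, home phone.
    static String randomRecord() {
        return String.join(",",
            randomField(NAME_LEN),
            randomField(EMAIL_LEN),
            randomField(PHONE_LEN),
            randomField(PHONE_LEN));
    }

    // Builds the CSV text for the requested number of records.
    static String generate(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(randomRecord()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Record count from the command line, defaulting to 10.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        System.out.print(generate(count));
    }
}
```

<p>Running it with an argument of 10000 and redirecting the output to a file would give a CSV of the size used in the first test. For the multi-million-row runs, streaming each row to a writer instead of building one large string would be the safer choice.</p>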

<p>I ran an example PIM program that comes with Curl which was already able to read a CSV file and load it into a Curl visual table widget.  The Curl application launched and loaded the <strong>10,000</strong> records quickly. It was able to sort on any of the fields - none of which are indexed - instantly.  Scrolling through the records was also instant - no delays at all.  </p>

<p>Next I generated a CSV file with <strong>100,000 </strong>rows of random records. Curl launched and loaded these records in about 4 seconds. It was able to sort on any column in about 2 seconds and provided instant scrolling through all 100,000 records.</p>

<p>Next I generated a CSV file with <strong>1 million</strong> records.  Curl took about 2 minutes to launch and load and about 25 seconds to sort on any given column. Surprisingly, scanning through the 1 million records by moving or clicking on the vertical scroll bar was instantaneous.  I have no idea why you can scroll through 1 million records with no latency but sorting them takes 25 seconds.</p>
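
<p>One possible explanation, offered as an unverified assumption about how table widgets commonly work rather than a statement about Curl's internals: scrolling only needs the handful of rows currently in view, while sorting has to compare rows across the whole data set. The sketch below contrasts the two costs; every class and method name in it is hypothetical.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

public class ScrollVersusSort {
    // Copies only the rows visible at a scroll position. The cost depends
    // on pageSize, not on the total number of rows.
    static List<String[]> visibleRows(List<String[]> rows, int firstRow, int pageSize) {
        int from = Math.max(0, Math.min(firstRow, rows.size()));
        int to = Math.min(rows.size(), from + pageSize);
        return new ArrayList<>(rows.subList(from, to));
    }

    // Sorting on a column touches every row: on the order of n log n comparisons.
    static void sortByColumn(List<String[]> rows, int column) {
        rows.sort(Comparator.comparing((String[] r) -> r[column]));
    }

    // Generates single-column rows of random UUID text as stand-in data.
    static List<String[]> randomRows(int count) {
        List<String[]> rows = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            rows.add(new String[] { UUID.randomUUID().toString() });
        }
        return rows;
    }

    public static void main(String[] args) {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 200_000;
        List<String[]> rows = randomRows(count);

        long t0 = System.nanoTime();
        List<String[]> page = visibleRows(rows, count / 2, 40);
        long t1 = System.nanoTime();
        sortByColumn(rows, 0);
        long t2 = System.nanoTime();

        System.out.println("rows in view: " + page.size());
        System.out.println("scroll step (ms): " + (t1 - t0) / 1_000_000.0);
        System.out.println("full sort (ms): " + (t2 - t1) / 1_000_000.0);
    }
}
```

<p>The timings it prints will vary by machine and say nothing about Curl itself; the point is only that the scroll step stays flat as the row count grows while the sort does not.</p>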

<p>Finally I met my match when I generated a CSV file with <strong>10 million</strong> records. The Curl application never launched, thus ending my experiment - I actually fell asleep waiting for the Curl application to finish loading, which it never did.  I'm not sure what part of the system got hung up. It may have been Curl, or the Firefox browser, or the funky MacBook Pro/Windows XP/VMware Fusion configuration I'm using.  It's probably a combination of all of these things.</p>

<p>I would love to see how other people's favorite platforms handle 1 million or even 10 million records. It looks like Web Dynpro is the current leader, but I think Curl might give it a better run for its money on a faster configuration. How many records can your favorite RIA platform handle?</p>]]>
      
    </content>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020461</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020461" />
    <title>Comment from Thomas Jung on 2008-08-07</title>
    <author>
        <name>Thomas Jung</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>First let me say that I really enjoyed your guest spot on RIA Weekly. I totally agree that moving RIA technology from the consumer to the enterprise space means that the technology will be faced with a different type of scaling challenge and it is going to be interesting to see what kinds of adaptations will take place.</p>

<p>I enjoy doing these kinds of extreme architectural experiments because it does make you think differently as a programmer and I love giving server administrators heart attacks. </p>

<p>So I was again inspired and wanted to give 10 million rows a try.  It resulted in a crazy big session state (not surprisingly) but did complete.<br />
<a href="http://flickr.com/photos/tjung/2743001410/">http://flickr.com/photos/tjung/2743001410/</a></p>

<p>I have the unfair advantage of a smallish server running my DB and App. server though. I'm pretty impressed that Curl was able to chew through a 1 million row CSV given the hardware you had it running on.   </p>

<p>As to your question about Web Dynpro being RIA, Fat, or Fit client - it is closest to the last one.  Web Dynpro produces a metadata based description of the UI and then different rendering engines can be applied to the same application at runtime.  The video from my blog is the AJAX/DHTML rendering engine - but we also have an Adobe Flex based rendering engine and a .Net Desktop Client based rendering engine.</p>]]>
    </content>
    <published>2008-08-08T01:12:24Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020473</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020473" />
    <title>Comment from Valentin on 2008-08-08</title>
    <author>
        <name>Valentin</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I have been playing with the Microsoft Grid for Silverlight, and with its support for data and UI virtualization 1m records are displayed with no problem. Actually - you can display 15m records - it doesn't matter because of the virtualization support.</p>]]>
    </content>
    <published>2008-08-08T07:31:27Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020525</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020525" />
    <title>Comment from Andrii Olefirenko on 2008-08-08</title>
    <author>
        <name>Andrii Olefirenko</name>
        <uri>http://v2.idubee.com/IdubeeClient.html</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://v2.idubee.com/IdubeeClient.html">
        <![CDATA[<p>hi<br />
in Adobe Flex, I've created a component that can handle a virtually unlimited number of rows and columns (limited only by the biggest int value possible and physical memory).<br />
<a href="http://v2.idubee.com/IdubeeClient.html">http://v2.idubee.com/IdubeeClient.html</a><br />
Press CTRL+m to insert million records<br />
One cell takes only 8 bytes, and render time is O(1) (constant) for any number of cells.</p>

<p>The standard Flex DataGrid component is also quite fast.</p>

<p>I worked with WebDynpro sometime ago. As I remember, it used some kind of ActiveX, so it's pretty much obese :) <br />
</p>]]>
    </content>
    <published>2008-08-09T04:01:19Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020539</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020539" />
    <title>Comment from Thomas Jung on 2008-08-09</title>
    <author>
        <name>Thomas Jung</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>>I worked with WebDynpro sometime ago. As I remember, it used some kind of ActiveX, so it's pretty much obese :) </p>

<p>Web Dynpro's main UI libraries don't use any ActiveX.  The UI elements are implemented in AJAX and DHTML. We do have a couple of specialized controls - for example Microsoft Excel Integration and multiple file uploads - that use either ActiveX or Java Applets.  These are pretty specific use cases, however. Even the Adobe Interactive Forms elements have been changed from ActiveX to JavaScript based. The future direction for UI elements that can't be accomplished in JavaScript/HTML is to use the Islands technology that allows us to embed Flash/Flex and Silverlight directly into Web Dynpro.</p>]]>
    </content>
    <published>2008-08-09T13:13:33Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020570</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020570" />
    <title>Comment from Charles Kendrick on 2008-08-10</title>
    <author>
        <name>Charles Kendrick</name>
        <uri>http://www.smartclient.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.smartclient.com">
        <![CDATA[<p>We've had "stress tests" like these for SmartClient for several years.  Trees are much harder by the way, try those sometime.  Or very large numbers of columns (500 or more) - this requires bidirectional incremental rendering to scale.  Or data cubes.</p>

<p>It's really hilarious to me to see plugin vendors make so much noise about performance and then be matched or surpassed by Ajax.</p>

<p>Ajax, Flash, Java and Curl scale to 1M+ rows.  Now can we stop talking about naive benchmarks like this, because it would be foolish to load this much data up front in the types of applications these platforms target.</p>

<p>Real-world performance in a WAN environment comes from really intelligent data management, which reduces server round-trips, thereby reducing network traffic and database hits.</p>
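
<p>As a rough illustration of that point, here is a minimal sketch of a client-side page cache that fetches rows in pages and reuses pages it already holds. It is not SmartClient's data binding architecture or any platform's API: the class, the method names, and the fetch function are all hypothetical stand-ins.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

public class PageCacheSketch {
    private final int pageSize;
    private final IntFunction<String[]> fetchPage; // stands in for a server call
    private final Map<Integer, String[]> cache;
    private int roundTrips = 0;

    PageCacheSketch(int pageSize, int maxPages, IntFunction<String[]> fetchPage) {
        this.pageSize = pageSize;
        this.fetchPage = fetchPage;
        // An access-ordered LinkedHashMap evicts the least recently used page
        // once more than maxPages pages are held.
        this.cache = new LinkedHashMap<Integer, String[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String[]> eldest) {
                return size() > maxPages;
            }
        };
    }

    // Returns one row, going to the "server" only when its page is not cached.
    String row(int index) {
        int page = index / pageSize;
        String[] rows = cache.get(page);
        if (rows == null) {
            roundTrips++;
            rows = fetchPage.apply(page);
            cache.put(page, rows);
        }
        return rows[index % pageSize];
    }

    int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        // Fake server: builds the rows of a page on demand.
        PageCacheSketch table = new PageCacheSketch(100, 10, page -> {
            String[] rows = new String[100];
            for (int i = 0; i < rows.length; i++) {
                rows[i] = "row-" + (page * 100 + i);
            }
            return rows;
        });
        for (int i = 0; i < 300; i++) {
            table.row(i); // scrolling through 300 rows touches 3 pages
        }
        System.out.println("round trips: " + table.roundTrips());
    }
}
```

<p>In this toy setup, reading 300 consecutive rows costs three fetches rather than 300, which is the kind of saving the paragraph above is describing.</p>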

<p>If you care about performance, evaluate a platform's data binding architecture and its facilities for re-using already loaded data, managing caches, and performing client-side operations where possible.  If you could benchmark, say, the Cairngorm architecture in Flex against SmartClient's (Ajax-based) databinding architecture, you'd find very large performance differences that actually affect real applications.</p>]]>
    </content>
    <published>2008-08-11T00:26:48Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2020587</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2020587" />
    <title>Comment from Richard Monson-Haefel on 2008-08-11</title>
    <author>
        <name>Richard Monson-Haefel</name>
        <uri>http://www.curl.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.curl.com">
        <![CDATA[<p>Hi Charles,</p>

<p>Thanks for commenting on the blog. I would like to see a demonstration of SmartClient handling 1 million records. Can you post an example?  You say that Ajax surpasses the plug-ins - I would like to see an actual demonstration.</p>

<p>All the best,</p>

<p>Richard</p>]]>
    </content>
    <published>2008-08-11T14:29:35Z</published>
  </entry>

  <entry>
    <id>tag:www.insideria.com,2008://34.26336-comment:2042613</id>
    <thr:in-reply-to ref="tag:www.insideria.com,2008://34.26336" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html"/>
    <link rel="alternate" type="text/html" href="http://www.insideria.com/2008/08/the-one-million-records-challe.html#comment-2042613" />
    <title>Comment from Jake on 2008-09-12</title>
    <author>
        <name>Jake</name>
        <uri>http://www.viblend.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.viblend.com">
        <![CDATA[<p>Handling millions of records complicates the internal design. It also implies the use of virtual mode, which makes things a little harder for some beginner developers. However, if you think more about how to solve it, it becomes almost a standard engineering problem. There are some memory implications because sometimes you need to store more metadata. In terms of CPU performance, once everything is loaded, end users may notice little or no difference.</p>]]>
    </content>
    <published>2008-09-12T17:11:31Z</published>
  </entry>

</feed>
