324 x 9GB disks for 2.9TB? You could store nearly 1.3PB with the same number of 4TB disks. Doesn't seem like Moore's Law is holding up for storage, nor for internet speeds.
324 disks takes up most of a row. When we had to do service maintenance in Canyon Park (fun with NT 3.51 and 4.0, before the days of Terminal Server) I'd walk over to TerraServer and admire its enormity.
Not any more. Using high density disk enclosures, you can fit that many disks in a third of a rack, and if each disk held 4.5TB you'd have almost 1.5PB of storage!
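A quick back-of-the-envelope check on those capacity numbers (the per-disk sizes are the ones mentioned above, decimal units throughout):

    # total capacity of 324 disks at various per-disk sizes
    disks = 324
    for size_gb in (9, 4000, 4500):            # 9GB (1998), 4TB, 4.5TB
        print(f"{size_gb} GB/disk -> {disks * size_gb / 1000:,.1f} TB total")
    # 9 -> 2.9 TB, 4000 -> 1,296 TB (~1.3PB), 4500 -> 1,458 TB (~1.5PB)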
Just under 20 years ago when I got started in this racket, 30G was considered a huge database, cutting edge stuff, requiring experts and special techniques. Nowadays a serious DBA doesn't blink at 30T.
They would be leaps and bounds faster. Especially comparing random IOPS.
Let's just assume they were using a Seagate Cheetah 9LP. That's a 9GB, 10k RPM, 1MB cache UW SCSI disk that was widely available in 1998, for a whopping... $1200 each. $390,000 worth of disks... wow. That could buy 1300 * 4TB SAS drives today with a total raw storage of 5.2PB!
324 9LPs would have around 5.5GB/sec of total read/write throughput. That easily beats a single 4TB disk, with its roughly 200MB/sec read/write performance and terrible IOPS in comparison to even a single 9LP.
SSDs compare much more favorably here. 10 x 1TB SSDs for ~$6500 could easily reach 5.5GB/sec throughput and triple the capacity of the 324 x 9GB drives.
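For the curious, here's the rough math behind those figures as a quick sketch; the ~$300 price for a 4TB SAS drive and ~17MB/sec sustained throughput per 9LP are my assumptions, chosen to be consistent with the numbers quoted above:

    # 1998 disk budget vs. what it buys today
    drives_1998, price_1998 = 324, 1200            # Seagate Cheetah 9LP at ~$1200
    budget = drives_1998 * price_1998              # ~$388,800 ("$390,000 worth")
    n_4tb = budget // 300                          # ~1296 drives ("1300")
    print(budget, n_4tb, n_4tb * 4 / 1000, "PB")   # ~5.2PB raw

    # aggregate throughput of the 1998 array
    print(drives_1998 * 17 / 1000, "GB/sec")       # ~5.5GB/sec at 17MB/sec each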
You can't just compare storage on capacity, you also have to account for speed. Even these days, 146GB 15k RPM SAS drives are popular for non-SSD usage. A long way from 4TB, but they outperform the big drives any day when it comes to semi-random usage.
I wonder how much of that usage came from people doing the MSDN tutorials. Those were the golden days of MSDN - I remember learning about factoradics and permutations from my 10 CD MSDN library. Microsoft was doing great stuff back then.
I am thinking of doing a back-of-the-envelope, orders-of-magnitude analysis for a setup similar to this on an Amazon AWS EC2/S3/RDS/foo/bar/baz based system.
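Something like this, perhaps; a minimal sketch where every rate below is a placeholder to be swapped for current AWS pricing, and the instance/storage mix is invented for illustration:

    # hypothetical monthly cost for a TerraServer-style workload on AWS
    # all rates are placeholders, not real AWS prices
    s3_per_gb_month = 0.03      # object storage for the imagery
    rds_per_hour    = 1.00      # one large RDS instance
    ec2_per_hour    = 0.50      # per web front end
    egress_per_gb   = 0.09      # data transfer out

    imagery_gb = 5_000          # ~5TB of image data
    front_ends = 6              # matching the original six front ends
    egress_gb  = 2_000 * 30     # ~2TB/day out

    monthly = (imagery_gb * s3_per_gb_month
               + rds_per_hour * 24 * 30
               + ec2_per_hour * front_ends * 24 * 30
               + egress_gb * egress_per_gb)
    print(f"~${monthly:,.0f}/month")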
This was so much fun to use. I built perl scripts around this (learning at the same time differences between lat/lon and UTM) to map wireless access points[0][1].
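For anyone wondering what the lat/lon vs. UTM distinction involves, this is roughly what the conversion looks like today in Python with pyproj (I can't reproduce the original Perl; the example point and derived EPSG code are just for illustration):

    # convert a WGS84 lat/lon point to UTM with pyproj
    from pyproj import Transformer

    lat, lon = 61.19, -149.90                # example point (Anchorage)
    zone = int((lon + 180) // 6) + 1         # UTM zone from longitude -> 6
    to_utm = Transformer.from_crs("EPSG:4326", f"EPSG:{32600 + zone}",
                                  always_xy=True)   # 326xx = northern hemisphere
    easting, northing = to_utm.transform(lon, lat)
    print(zone, round(easting), round(northing))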
I can see if there's anything in any of my backups. Nowadays I'd just use Google Earth's history feature (optionally feeding in time based KMLs of access points). I think the WiGLE interface supports time filtering, but I don't know if that's officially exposed.
Thanks! It probably isn't exposed, but it would be nice to see these grow over the years. It may even expose other things like how community hubs have shifted and what demographics are being catered to. Overlays from census data or other insights might reveal some interesting things.
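If observation dates can be exported from WiGLE or elsewhere, generating time-stamped KML for Google Earth's time slider is straightforward; a minimal sketch (the access-point tuples are made up):

    # write a KML file with time-stamped placemarks
    points = [("ap-1", -149.90, 61.19, "2004-06-01"),
              ("ap-2", -149.88, 61.20, "2006-09-15")]  # (name, lon, lat, seen)

    placemarks = "".join(
        f"<Placemark><name>{name}</name>"
        f"<TimeStamp><when>{when}</when></TimeStamp>"
        f"<Point><coordinates>{lon},{lat}</coordinates></Point></Placemark>"
        for name, lon, lat, when in points)

    with open("aps.kml", "w") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>'
                '<kml xmlns="http://www.opengis.net/kml/2.2">'
                f"<Document>{placemarks}</Document></kml>")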
The company I worked for back-ended some of these datasets. Definitely a bunch of USGS imagery. At the time, several racks of Solaris servers were required to serve up the imagery, using Tomcat for the requests. Those machines were in Canada. Windows was used for ArcGIS and Oracle.

Shortly after Opteron was released and IBM came out with a 64-bit JVM, I moved the whole thing, except for the ESRI portion, over to a simple load balancer running Apache and two mirrored dual Opterons running 64-bit Fedora (2002). A total of half a rack, which handled about a million image requests a day or so. Each request involved decomposing, reading and stitching image mosaics together, and sending the result out in under 10s. Unfortunately we were back-ending more than just TerraServer, so I don't remember who had what specs. With all the memory alloc/dealloc going on inside the Java process, we were fortunate if it didn't require a full restart every 12 hours.
The funniest part of the whole thing was that I rewrote the engine from Java over to a C++ stitcher and got request times under 1s on a dual P3 600. That was bypassing the ESRI server, the Oracle database and the Java stuff. Around that time the UMN MapServer started getting interesting as well, although their image processing was glacial at the beginning. And then I was retasked, and I left before being able to finish up and deploy.
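The stitching step described above is conceptually simple; a minimal Python/Pillow sketch of pasting fixed-size tiles into one mosaic (the tile naming scheme and sizes are invented):

    # stitch a grid of equal-sized map tiles into one mosaic image
    from PIL import Image

    TILE = 200                                  # tile edge length in pixels
    cols, rows = 4, 3                           # mosaic dimensions in tiles
    mosaic = Image.new("RGB", (cols * TILE, rows * TILE))

    for row in range(rows):
        for col in range(cols):
            tile = Image.open(f"tiles/{row}_{col}.jpg")  # hypothetical layout
            mosaic.paste(tile, (col * TILE, row * TILE))

    mosaic.save("mosaic.jpg")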
Wait, so you're saying that the TerraServer project that was supposed to show off Windows and SQL Server capabilities was back-ended by a third-party company using Tomcat, Oracle, Java and Linux? That doesn't make any sense. The OP-linked paper describes how it was all done with NT, IIS and SQL Server. http://msdn.microsoft.com/en-us/library/aa226316%28v=sql.70%...
Good quality 720p should definitely be more than a gig for movies. Even high quality 480p rips came in at over a gig for most movies. I'm sure the torrents are watchable, but there are sacrifices being made somewhere.
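The arithmetic behind that: file size is just bitrate times runtime, so a ~1GB, 2-hour 720p rip implies a fairly low video bitrate (the 2-hour runtime is my assumption):

    # what average bitrate does a 1GB, 2-hour movie imply?
    size_bits = 1 * 1000**3 * 8                    # 1GB in bits, decimal units
    runtime_s = 2 * 3600                           # 2-hour movie
    print(size_bits / runtime_s / 1e6, "Mbit/s")   # ~1.1 Mbit/s, low for 720p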
Funny how things change a bit. EMC now owns CLARiiON. DEC no longer exists. EMC now owns Legato NetWorker. Seagate's backup management group was bought by Veritas, who then merged with Symantec. Such is the potted history of storage companies!
I don't think it's wrong to criticize them for missing this as an opportunity.
I remember playing with it at the time, it was really revolutionary. There was literally nothing like this available to the general public before this.
And who owns mapping these days? Google. It's pretty sad that Microsoft wasn't able to turn a six year head start into a reasonable business.
That's sort of been the story of Microsoft for the past 15 years or so, lots of cool ideas, then a failure to execute and make a great product out of them. I wonder if young engineers these days realize how formidable Microsoft used to be. There was a time when having MSFT in your rearview mirror meant you were in deep shit. Now, they're just another big IT company.
You think that's sad? MSN Messenger had millions of users, complete with their "friends". And they failed to turn that into any social network worth anything. Duh.
Microsoft has a culture of being first of the market, though not always with the best product. Products like Windows 98 and IE4 were buggy, but so far ahead of their time that they had a crushing impact and it took the rest of the world years to catch up. Nowadays competitors with better solutions are usually not far behind. Google and Apple are rarely first to the market, but usually the best. It's a more effective strategy nowadays.
Amazon is one of the few companies that still manages to be first to the market and be successful (Kindle, AWS). Perhaps because many of the old Microsofties went there.
I really don't think either Windows 98 or IE4 were "first to market". OS/2 and Netscape were available and better at the time.
Microsoft's skill was in other aspects - really building the whole products that businesses needed, and building a complex channel of services companies to deliver installation, customisation and training.
My hunch is that's still their advantage, above say the Google Apps team, and it'll come back into play as the quick wins of Internet and mobile wear off and deep integration into businesses is needed again.
I was doing webdev in the 90s when Netscape Navigator was king.
I have always believed that Microsoft won the browser war, less because of integration with Windows and more because IE4 was so much better. It was generations ahead of Netscape and they focused on making a canvas for developers (excuse the HTML5 pun).
Lots of the 'Microsoft ignores standards' crowd were Netscape-centric people who did not like that Microsoft tried lots of new stuff with IE4, but in hindsight this forward push propelled the idea of the web as a platform for development.
Most people using the internet today never used IE4; they were not there and have no clue how big a deal it was.
Reading "..combining five terabytes of image data.." and "Figure 1. The TerraServer hardware" while looking at the portable 2TB external HDD sitting on my desk... how technology advances rapidly. Makes me wonder how (storage) technology can be in 10-15 years.
Yet our smartphones can't seem to get over the 64GB limit. Personally, I'm afraid that the cloud will effectively slow down progress in the storage space, at least in the consumer portion of it.
SD cards are available with up to 256GB of storage. They just cost $400 or so[0], so I don't think enough people see the value of paying for that much storage in a phone.
The flash memory chips used in phones are essentially the same as those used in SD cards, and phones often use SDIO for internal communication. The SD card is just a point of reference for the cost and availability of flash memory unbundled from a phone.
There's space inside for more storage in most phones. 64GB is just a point where the number of people who would actually use that amount of space rapidly diminishes.
10GB in that AlphaServer seems like a massive amount of memory for the time. But DEC claimed capability of up to 28GB. An 8400 configured the way they had it was probably half a million dollars, not including the disks of course.
You can see how far behind Intel hardware was at the time - the quad-CPU Pentium Pro boxes only had 256MB of memory! Sure, that was only a $40k box from Compaq...
Fun fact: At the time, Quake engine licensees were buying 4-way Alpha and Intel servers to do the level processing for first person shooters.
TerraServer hardware configuration parameters:
Max hits: 40 million/day
Max SQL queries: 37 million/day
Max image downloads: 35 million/day
Bandwidth to Internet: 200 Mbps = ~2 terabytes/day (quick check below)
Concurrent web connections: 6,000
Web front ends: six 4-way 200MHz Compaq ProLiant 5500, 0.5GB RAM each
Database back end: one 8-way 440MHz Compaq AlphaServer 8400, 10GB RAM, 3.2TB RAID5
Storage: 324 x 9GB Ultra SCSI disks
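A quick check that 200 Mbps really works out to roughly 2 terabytes a day (decimal units):

    # Mbit/s -> TB/day
    mbps = 200
    tb_per_day = mbps / 8 * 86400 / 1e6   # MB/s * seconds/day -> TB
    print(tb_per_day, "TB/day")           # ~2.16 TB/day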
[0] http://www.math.uaa.alaska.edu/~afkjm/cs401/project_fall_04/...