useful tools

What a difference a Gig makes

Tuesday, October 14th, 2008 | hardware, linux, useful tools | No Comments

We’re working on a project at the moment that involves deploying various Linux services for visualising oceanographic modelling data using tools such as Unidata’s THREDDS Data Server (TDS) and NOAA/PMEL’s Live Access Server (LAS). TDS is a web server for making scientific datasets available via various protocols, including plain old HTTP, WCS, and OPeNDAP, which allows a subset of the original datasets to be accessed. LAS is a web server which, using sources such as an OPeNDAP service from TDS, allows you to visualise scientific datasets, rendering the data overlaid onto world maps and letting you select the particular variables you are interested in. In our case, the datasets are generated by the Regional Ocean Modeling System (ROMS) and include variables such as sea temperature and salinity at various depths.

The data generated by the ROMS models we are looking at uses a curvilinear coordinate system. To the best of my understanding (and I’m a Linux guy, not an oceanographer, so my apologies if this is a poor explanation), since the data is modelling behaviour on a spherical surface (the Earth), it makes more sense to use a curvilinear coordinate system. Unfortunately, some of the visualisation tools, LAS in particular, prefer to work with data on a regular or rectilinear grid. Part of our workflow involves remapping the data from curvilinear to rectilinear using a tool called Ferret (also from NOAA). Ferret does a whole lot more than regridding (and is, in fact, used under the hood by LAS to generate a lot of its graphical output) but in our case, we’re interested mainly in its ability to regrid the data from one gridding system to another. Ferret is an interesting tool/language – an example of the kind of script required for regridding is this one from the Ferret examples and tutorials page. Did I mention we’re not oceanographers? Thankfully, someone else prepared the regridding script; our job was to get it up and running as part of our workflow.

We’re nearly back to the origins of the title of this piece now, so bear with me!

We’re using a VMware virtual server as a test system. Our initial deployment was a single processor system with 1 GB of memory. It seemed to run reasonably well with TDS and LAS – it was responsive and completed requests in a reasonable amount of time (purely subjective but probably under 10 seconds if Jakob Nielsen’s paper is anything to go by). We then looked at regridding some of the customer’s own data using Ferret and were disappointed to find that an individual file took about 1 hour to regrid – we had about 20 files for testing purposes and in practice would need to regrid 50-100 files per day. I took a quick look at the performance of our system using the htop tool (like the traditional top tool found on all *nix systems but with various enhancements and very clear colour output). There are more detailed performance analysis tools (including Dag Wieers’ excellent dstat) but sometimes I find a good high-level summary more useful than a sea of numbers and performance statistics. Here’s a shot of the htop output during a Ferret regrid:

High kernel load in htop

What is interesting in this shot is that:

  • All of the memory is used (and in fact, a lot of swap is also in use).
  • While running the Ferret regridding, a lot of processor time is being spent on kernel activity (red) instead of normal user activity (green).

High kernel (or system) usage of the processor is often indicative of a system that is tied up doing lots of I/O. If your system is supposed to be doing I/O (a fileserver or network server of some sort) then this is good. If your system is supposed to be performing an intensive numerical computation, such as here, we’d hope to see most of the processor being used for that compute intensive task, and a resulting high percentage of normal (green) processor usage. Given the above it seemed likely that the Ferret regridding process needed more memory in order to efficiently regrid the given files and that it was spending lots of time thrashing (moving data between swap and main memory due to a shortage of main memory).
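For anyone who prefers numbers to colours, the user/kernel split that htop draws can be approximated from the aggregate cpu line of /proc/stat. What follows is a minimal Python sketch under some assumptions, not a description of how htop works internally: the function names (parse_cpu_line, cpu_shares, sample_shares) are made up for illustration, and it assumes a Linux kernel that lists its counters in the usual order (user, nice, system, idle, iowait, ...).

```python
import time
from typing import Dict

# Counter order on the aggregate "cpu" line of /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_shares(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Percentage of elapsed ticks spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no ticks elapsed between the two samples")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_shares(interval: float = 5.0, path: str = "/proc/stat") -> Dict[str, float]:
    """Read the counters twice, 'interval' seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return cpu_shares(before, after)
```

Calling sample_shares() while a job is running and comparing the "system" and "iowait" figures against "user" gives roughly the same picture as the red and green bars: a compute job that shows more system and iowait time than user time is probably waiting on the disk rather than computing.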

Since we’re working on a VMware server, we can easily tweak the settings of the virtual server and add some more processor and memory. We did just that after shutting down the Linux server. When we restarted the server, Linux immediately recognised the additional memory and processor and started using them. We retried our Ferret regridding script and noticed something interesting. But first, here’s another shot of the htop output during a Ferret regrid with an additional gig of memory:

Htop with high use processor time

What is immediately obvious here is that the vast majority of the processor is busy with user activity – rather than kernel activity. This suggests that the processor is now being used for the Ferret regridding, rather than for I/O. This is only a snapshot and we do observe bursts of kernel processor activity still, but these mainly coincide with points in time when Ferret is writing output or reading input, which makes sense. We’re still using a lot of swap, which suggests there’s scope for further tweaking, but overall, this picture suggests we should be seeing an improvement in the Ferret script runtime.
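Swap usage is just as easy to check without a screenshot. The sketch below parses text in the format of /proc/meminfo and reports how much swap is in use; the helper names (parse_meminfo, swap_used_kb, read_meminfo) are my own, and it assumes the SwapTotal and SwapFree fields are present, as they are on the Linux systems I have looked at.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are reported in kB; a few (such as HugePages_Total) are
    plain counts. Lines without a numeric value are skipped.
    """
    info: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info: Dict[str, int]) -> int:
    """Swap currently in use, in kB."""
    return info["SwapTotal"] - info["SwapFree"]


def read_meminfo(path: str = "/proc/meminfo") -> Dict[str, int]:
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Running swap_used_kb(read_meminfo()) before and after a regrid is a quick way to see whether a job is still leaning on swap, which is the "scope for further tweaking" mentioned above.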

Did we? That would be an affirmative. We saw the time to regrid one file drop from about 60 minutes to about 2 minutes. No, that’s not a typo: 2 minutes. By adding 1 GB of memory to our server, we reduced the overall runtime of the operation by about 97%. That is a phenomenal achievement for such a small, cheap change to the system configuration (1 GB of typical system memory costs about €50 these days).
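To put those figures in context, here is the arithmetic as a small Python sketch. It assumes the files are regridded one after another on a single machine (my assumption; nothing above says whether the jobs could run in parallel), and the function names are purely illustrative.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Runtime reduction as a percentage of the original runtime."""
    return 100.0 * (before_minutes - after_minutes) / before_minutes


def daily_hours(files_per_day: int, minutes_per_file: float) -> float:
    """Hours of regridding needed per day if files are processed serially."""
    return files_per_day * minutes_per_file / 60.0


if __name__ == "__main__":
    print(f"reduction: {percent_reduction(60, 2):.1f}%")
    for files in (50, 100):
        print(f"{files} files/day: {daily_hours(files, 60):.1f} h at 60 min/file, "
              f"{daily_hours(files, 2):.1f} h at 2 min/file")
```

At 60 minutes per file, even the low end of 50 files per day needs 50 hours of processing, which does not fit into a 24-hour day. At 2 minutes per file, 100 files take a little over 3 hours.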

What’s the moral of the story?

  1. Understand your application before you attempt tuning it.
  2. Never, ever tune your system or your application before you understand where the bottlenecks are.
  3. Hardware is cheap: consider throwing more hardware at a problem before attempting expensive performance tuning exercises.

(With apologies to María Méndez Grever and Stanley Adams for the title!)


Stress testing a PC revisited

Thursday, September 25th, 2008 | hardware, linux, useful tools | No Comments

I’m still using mostly the same tools for stress testing PCs as when I last wrote about this topic. memtest86+ in particular continues to be very useful. In practice, the instrumentation in most PCs still isn’t good enough to identify which DIMM is failing most of the time (mcelog sometimes makes a suggestion about which DIMM has failed and EDAC can also be helpful, but in my experience there is lots of hardware out there which doesn’t support these tools well). The easiest approach I’ve found to date is to take out one DIMM at a time and re-run memtest86+. When the errors go away, you’ve found your problematic DIMM – put it back in again and re-run to make sure you’ve identified the problem. If you keep getting the errors regardless of which DIMMs are installed, you may be looking at a problem with the memory controller (either on the processor or the motherboard, depending on which type of processor you are using) – if you have a second machine with identical hardware, try swapping the components into it for further testing.
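The one-DIMM-at-a-time procedure is simple enough to write down as a loop. In the Python sketch below, memtest_fails is a hypothetical stand-in for a person pulling DIMMs and running memtest86+ on whatever is left installed; it is not a real API, and the function assumes at least two DIMMs so that the machine can still boot with one removed.

```python
from typing import Callable, FrozenSet, Optional, Sequence


def isolate_faulty_dimm(
    dimms: Sequence[str],
    memtest_fails: Callable[[FrozenSet[str]], bool],
) -> Optional[str]:
    """Return the DIMM whose removal clears the errors, or None.

    memtest_fails(installed) should return True if a memtest86+ run with
    exactly those DIMMs installed reports errors (hypothetical callback).
    """
    if len(dimms) < 2:
        raise ValueError("need at least two DIMMs to remove one and still boot")
    all_installed = frozenset(dimms)
    for suspect in dimms:
        if not memtest_fails(all_installed - {suspect}):
            # Errors went away: put the DIMM back and re-run to confirm.
            if memtest_fails(all_installed):
                return suspect
    # Errors persisted whichever DIMM was removed (or never appeared):
    # time to suspect something else, such as the memory controller.
    return None
```

A None result covers the "errors regardless of which DIMMs are installed" case, which is the cue to start looking at the memory controller or at swapping parts into identical hardware.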

Breakin is a tool recently announced on the Beowulf mailing list which also looks like it has a lot of potential, and I plan on adding it to my stress testing toolkit the next time I encounter what looks like a possible hardware problem. What looks nice about Breakin is that it tests all of the usual suspects, including processor, memory and hard drives, and it includes support for temperature sensors, MCE logging and EDAC. This is attractive from the perspective of being able to fire it up, walk away and come back to check on progress 24 hours later.

Finally, we’ve found the Intel MPI Benchmarks (IMB, previously known as the Pallas MPI benchmark) to be pretty good at stress testing systems. Anyone conducting any kind of qualification or UAT on PC hardware, particularly hardware intended to be used in HPC applications, should definitely be including an IMB run as part of their tests.


Viruses and Malware on Windows

Tuesday, September 9th, 2008 | useful tools, windows | 2 Comments

Here I am writing about Windows – if I’m not careful, I’ll have to rename this blog to Thoughts on Windows. What’s the Linux angle here? I guess I’m the smug Linux user poking fun at Windows or something along those lines (but don’t leave just yet if you’re one of those smug Windows users; I’d be interested in your thoughts on the following).

Two unrelated events inspired this piece. I came across an interesting blog recently comparing the performance of various anti-virus products on a number of items of malware. I hadn’t come across the people behind it, InfraGard, before, but given their links to the FBI they seem to have some credibility, so I’m assuming their testing methodologies are reasonably reliable.

Three things struck me about that blog:

  • AVG does a pretty good job of protecting Windows systems from malware and viruses (I know I’m starting to sound like an AVG fan-boy between this and my previous references to it).
  • Some of the “leading” anti-virus programs / suites are pretty poor at protecting Windows systems (not to mention the fact that they interfere with the operation of your computer).
  • You can’t rely on any anti-virus software to fully protect your Windows system.

That’s about the point where I became the smug Linux user, up until I remembered that I have to look after my share of Windows systems, both in our offices and for friends and family. This brings me on to the second recent event which inspired this piece. A friend running Windows Vista had recently started getting worrying messages about something called Trojan-Spy.Win32.KeyLogger.aa trying to send traffic from his PC and wanted to know if he should be worried. “Probably”, I said and took a look at his system.

In the past, my toolbox for a healthy Windows PC would include the aforementioned AVG and, if I had concerns about spyware, Spybot – Search & Destroy – another great Windows tool that is free for non-commercial use. Between those two tools, I could be pretty confident that a Windows machine was running clean of any malicious software. So I installed and ran both on my friend’s PC – multiple times! Spybot even suggested running immediately after start-up as Administrator so that it could ferret out as much dodgy malware as possible. A few hours later, we were still being entertained by messages from Windows about our good friend Trojan-Spy.Win32.KeyLogger.aa (and maybe some others) which hadn’t even been detected by AVG or Spybot, never mind removed by them.

Some research on the interweb turned up posts and comments from various people who had encountered this particular trojan, and by all accounts it’s a tough one to remove. I was on the verge of suggesting an OS re-install (taking inspiration from Aliens, sometimes nuking the system from orbit is the only way to be sure), possibly in tandem with a Linux install to forever banish such nasties, when I came across some references to another tool called Superantispyware, which some recommended as the antidote to Trojan-Spy.Win32.KeyLogger.aa. With a name like that, it had to be good at dealing with spyware, right? I figured it was worth a shot before we tried something more drastic, particularly since there is a free for non-commercial use version available. One download and install later, it kicked off and immediately warned us about some spyware it had found (either our friend the KeyLogger or another, as yet unknown, piece of spyware). After a half hour or so, it had finished a scan and proceeded to remove or quarantine all of the various pieces of spyware it had turned up. We booted the system once more, re-ran AVG and Spybot S&D and didn’t get any more warnings about Trojan-Spy.Win32.KeyLogger.aa trying to send data off the system. My friend was happy enough that the system was clean. Me? I’d probably still go and re-install the OS before putting my credit card details near the computer again (to be sure, to be sure) but the odds are it is clean – for which we probably have Superantispyware to thank.

So, what are our conclusions?

  • (With my smug Linux hat on once more) – consider installing and running Linux for your home desktop – a distribution such as the latest Ubuntu will provide all the software you need for typical day to day surfing, emailing and word-processing and won’t leave you open to half of this stuff (you’ll still be susceptible to phishing attacks and cross-site scripting attacks but you’ll be automatically eliminating a whole world of viruses, keyloggers and trojans which won’t ever run on a Linux system).
  • If you must run Windows, make sure you install some decent software to protect you – start with AVG, Spybot S&D (and maybe Superantispyware) – or leave a comment to tell us about other useful ones.
  • If you’re running Windows, do not use the Administrator account for your activities, and don’t set up an alternative account with administrator privileges either – that kinda defeats the purpose. I know it’s a pain in the ass when you want to install some new software, but trust me, it’ll be a bigger pain in the ass when someone starts buying things from iTunes with your credit card.
  • Don’t click on things that you don’t understand and don’t install stuff from random web-pages, even if they do tell you it’s for your security (c’mon, if some random stranger came to your door and told you they needed to “install something” in your bedroom “for your security” you’d slam the door in their face before calling the police, so why would you react differently to a stranger on the internet?).
  • Finally, the bad news is that the email you just received claiming to be a red hot picture of Britney or Christina in a compromising position … well, it probably isn’t (I know, if some international criminal ring is going to take over your computer for nefarious purposes you’d think they’d at least give you a naughty picture to take your mind off things, but I’m afraid they generally don’t play fair), so don’t click on the attached zip-file.
