LINQ to SQL (part 2)

Follow on from Part 1


The problem I find with these ‘drag your table on here and it all just works’ tools is… extensibility. If you just want an editing tool then they’re great, but anything custom becomes really hard (to the point where you end up binning it and hand-rolling from scratch!). Not so with LINQ to SQL…


Validation


The data layer supports any kind of validation you like, and a failure surfaces in your code like a normal exception. For example, if we want to prevent someone entering a date from the past, we can just go into the pre-generated designer class, find the OnValidate sub and put in our code:
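As an illustration of the pattern (in Python rather than the post’s VB.NET, with invented names; in LINQ to SQL the hook itself is generated for you):

```python
# Sketch of the OnValidate idea in plain Python: the entity validates itself,
# and a bad value surfaces as a perfectly ordinary exception in calling code.
# The BackupJob class and its fields are invented for illustration.
import datetime

class BackupJob:
    def __init__(self, start_date):
        self.start_date = start_date

    @property
    def start_date(self):
        return self._start_date

    @start_date.setter
    def start_date(self, value):
        # The equivalent of code dropped into the generated OnValidate sub:
        if value < datetime.datetime.now():
            raise ValueError("Start date cannot be in the past")
        self._start_date = value

# A future date saves fine; a past one raises straight back to the caller.
job = BackupJob(datetime.datetime.now() + datetime.timedelta(days=1))
```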



 


Run our insert code with a date in the past, and the exception pops out just like any other.



A very neat solution, and it doesn’t rely on any complicated overrides.

Extra Custom Code


The other extensibility issue I normally find is when you want to ‘do something else’. Say you want to bill for copied MBytes through the backup service. Where would you put that? You could do it in the main code, but you can also handle it in the data layer. For example, you can put some code in this ‘attach’ method:
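A sketch of the idea in Python (all names here, BackupJob, JobHistory, bill_for_copied_mbytes, are invented; in LINQ to SQL the attach method is generated for you):

```python
# Sketch of the 'attach' hook in plain Python: whenever a JobHistory row is
# added to a BackupJob, a billing routine runs with data from BOTH entities.
# Every name here is an invented stand-in for illustration.
billed = []  # stand-in for the real billing system

def bill_for_copied_mbytes(computer, mbytes):
    billed.append((computer, mbytes))

class JobHistory:
    def __init__(self, copied_mbytes):
        self.copied_mbytes = copied_mbytes

class BackupJob:
    def __init__(self, computer):
        self.computer = computer
        self.history = []

    def add_history(self, row):
        # The equivalent of extra code dropped into the generated attach
        # method: it runs on every Add, and sees parent and child together.
        bill_for_copied_mbytes(self.computer, row.copied_mbytes)
        self.history.append(row)

job = BackupJob(computer="Mimir")
job.add_history(JobHistory(copied_mbytes=512))
```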



 


Now whenever we call the Add method, our code runs and we can bill for those pesky copied MBytes.



Note the really cool part… in our parameters for the call to the billing routine, we’re grabbing info from both entities.


 

LINQ to SQL (part 1)

I’ve begun my learning escapade with LINQ to SQL. I sort of had an idea what it was about but had never really had a chance to play with it. I’m glad I took the time! There’s a million and one tutorials on the net so I won’t bother replicating one. I’ll just do a ‘snapshot’ of some of the features.


 


Diagram


First of all, you add a LINQ to SQL Classes item in Solution Explorer. Then drag a couple of tables from Server Explorer onto the design surface. You can then tell it what the relationship between the tables is (you don’t have to have set this up in SQL!).

Classes


Then, when you hit save, it auto-generates some classes for you that look like voodoo inside.

Getting Records


Now the cool part… the code window. Check out how simply you can select a record and update it. Then see how we can select a record that would previously have required a SQL JOIN: we’ve selected from the JobHistory table, where the related BackupJob has a Computer field which equals “Mimir”. It’s all automatic; I’ve not added anything else in. Very clever.
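For a feel of what the ORM is doing, here’s that query written out by hand with Python’s sqlite3; the schema is invented to match the tables the post mentions, and ‘Odin’ is a made-up second machine:

```python
# What the LINQ query boils down to under the hood: a JOIN the ORM writes
# for us. Sketched here with Python's sqlite3; the schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BackupJob (ID INTEGER PRIMARY KEY, Computer TEXT);
    CREATE TABLE JobHistory (ID INTEGER PRIMARY KEY,
                             JobID INTEGER REFERENCES BackupJob(ID),
                             Result TEXT);
    INSERT INTO BackupJob (ID, Computer) VALUES (1, 'Mimir'), (2, 'Odin');
    INSERT INTO JobHistory (JobID, Result) VALUES (1, 'OK'), (2, 'OK');
""")

# JobHistory rows where the related BackupJob's Computer is 'Mimir' -- the
# JOIN that LINQ to SQL generates when you just filter on the relationship.
rows = conn.execute("""
    SELECT jh.ID, jh.Result
    FROM JobHistory jh
    JOIN BackupJob bj ON bj.ID = jh.JobID
    WHERE bj.Computer = 'Mimir'
""").fetchall()
```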



 


Adding Records


Adding a record in SQL is normally an utter PITA: no IntelliSense, so you end up hand-building a huge parameterised query. Not with LINQ to SQL!
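For contrast, here’s the chore being avoided, a hand-built parameterised INSERT, sketched with Python’s sqlite3 (schema and values invented):

```python
# The chore LINQ to SQL does away with: building the parameterised INSERT by
# hand. Sketched with Python's sqlite3; the schema and values are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BackupJob (ID INTEGER PRIMARY KEY, Computer TEXT, StartDate TEXT)")

# No IntelliSense to help you keep the column list, placeholder count and
# parameter order all lined up by hand.
conn.execute(
    "INSERT INTO BackupJob (Computer, StartDate) VALUES (?, ?)",
    ("Mimir", "2009-05-01 09:00:00"),
)

count = conn.execute("SELECT COUNT(*) FROM BackupJob").fetchone()[0]
```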



 


Adding Relational Records


I’ve still got more digging to do into the product, but this bit has pretty much blown me away. This would have been a right nightmare in SQL: you’d have had to do an insert, return the SCOPE_IDENTITY, then insert the history rows using that. LINQ to SQL makes it staggeringly easy. You just build the objects in memory, add the ‘child’ rows to the main row and off you go.
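The ‘old way’ described above looks something like this sketch in Python’s sqlite3 (schema invented; `lastrowid` standing in for SCOPE_IDENTITY), and it’s exactly this dance that LINQ to SQL hides:

```python
# The 'old way': insert the parent, grab its new identity, then use it for
# the child rows. Sketched with Python's sqlite3; the schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BackupJob (ID INTEGER PRIMARY KEY, Computer TEXT);
    CREATE TABLE JobHistory (ID INTEGER PRIMARY KEY,
                             JobID INTEGER REFERENCES BackupJob(ID),
                             Result TEXT);
""")

cur = conn.execute("INSERT INTO BackupJob (Computer) VALUES (?)", ("Mimir",))
new_id = cur.lastrowid  # the SQL Server equivalent is SCOPE_IDENTITY()

for result in ("OK", "Failed"):
    conn.execute("INSERT INTO JobHistory (JobID, Result) VALUES (?, ?)",
                 (new_id, result))

child_job_ids = [r[0] for r in conn.execute("SELECT JobID FROM JobHistory")]
```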



And to prove it works, the rows duly appear in both tables.

I have to say, I’m hugely impressed! I wish I’d known about it sooner. For raw performance it’s undoubtedly better to stick to hand-rolled SQL, but for those horrendously boring CRUD scenarios, like saving user preferences, this will be a godsend!

Building a scalable, fault tolerant background processing system (part 2)

After having decided how I’ll roll my batch jobs up (see Part 1), I now needed a ‘thingy’™ to fire the jobs off.

This ‘thingy’ should probably be a .NET desktop app rather than a service, so I can touch it and see it. It simply needs to look in a SQL table to fetch the ‘queue’, then execute the jobs. But what if it crashes, or the machine goes down? Simple: we go multi-server. We run the ‘thingy’ (let’s call it QueueProcessor from now on) on each of the web servers in our web farm. We’ll put the ‘job’ web pages on each machine as well, so they can simply call http://localhost/backgroundservices/dosomething.aspx
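As a rough illustration (Python standing in for the .NET app, with an invented queue schema), one polling pass of the QueueProcessor might look like:

```python
# Bare-bones sketch of one QueueProcessor pass: pull pending jobs from a SQL
# 'queue' table and execute each one by requesting its web page on localhost.
# The schema and helper names are invented stand-ins for illustration.
import sqlite3
import urllib.request

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE JobQueue (ID INTEGER PRIMARY KEY, Url TEXT, Done INTEGER DEFAULT 0)")

def fetch_page(url):
    # The real thing is just an HTTP GET against the job page.
    return urllib.request.urlopen(url).read()

def process_queue(fetch=fetch_page):
    """One polling pass: run every pending job, oldest first."""
    executed = []
    pending = conn.execute(
        "SELECT ID, Url FROM JobQueue WHERE Done = 0 ORDER BY ID").fetchall()
    for job_id, url in pending:
        fetch("http://localhost/backgroundservices/" + url)
        conn.execute("UPDATE JobQueue SET Done = 1 WHERE ID = ?", (job_id,))
        executed.append(url)
    return executed
```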

 

 

OK, but what do we do about polling? Since this is designed for scalability and resilience over outright speed/low latency, you probably only need to check the queue once per second (and let’s be honest, if you wanted to be more responsive than that you wouldn’t poll at all, you’d find a way of pushing data instead). If it was a single server it’d be a piece of cake, but multi-server we need to be cleverer.

We need to poll the queue every second, but from alternating machines

 

 

A few caveats though:

  1. The local times on the machines will be different.
  2. The network latency to the server can be different from machine to machine (if some are located elsewhere)

If the local times are out of sync you can’t possibly hope to get two servers checking on an evenly staggered schedule. If one server is 0.4 seconds out you’d get checks at 0.000, 0.100, 1.000, 1.100, all bunched together instead of evenly spread. Not ideal.

So I use SQL time as the central time, i.e. return GETDATE() as part of the queue check. I also work out the network latency by measuring the response time (assuming the query itself took 0ms on the server). That gives me an offset between local time and server time. I then have a timer that runs every 10ms and does a simple calculation using local time plus the offset, together with the number of running machines (grabbed by counting how many have ‘checked in’), to determine whether it should do anything.
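A sketch of that timing logic in Python, with invented names; the assumptions are the ones stated above (the query takes 0ms on the server, and each machine knows its index and the machine count from the check-ins):

```python
# Sketch of the timing logic: anchor everything to the SQL server's clock,
# then stagger each machine's once-per-second poll within that shared clock.
import time

def measure_offset(query_server_time):
    """Estimate (server clock - local clock). We assume the query took
    0ms on the server, so one-way latency is half the round trip."""
    sent = time.time()
    server_time = query_server_time()  # e.g. SELECT GETDATE(), as a float
    rtt = time.time() - sent
    return server_time - (sent + rtt / 2)

class StaggeredPoller:
    def __init__(self, offset, machine_index, machine_count, interval=1.0):
        self.offset = offset
        self.index = machine_index
        self.count = machine_count
        self.interval = interval
        self.last_tick = None

    def should_poll_now(self, now=None):
        """Call from a fast (e.g. 10ms) timer. Returns True once per
        interval, at this machine's staggered point within the interval,
        measured on the *server* clock so every machine agrees."""
        server_now = (time.time() if now is None else now) + self.offset
        my_point = self.index * (self.interval / self.count)
        tick = int((server_now - my_point) // self.interval)
        if tick != self.last_tick:
            self.last_tick = tick
            return True
        return False
```

With two machines, machine 0 aims for x.000 and machine 1 for x.500 on the server’s clock, which is exactly the evenly spread pattern described above.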

Simple.

Expanding your programming skillset

I’ve followed a pretty simple education process when learning how to program:



  1. BBC Basic
  2. QuickBasic
  3. Acorn Basic
  4. Visual Basic 5
  5. Visual Basic 6
  6. ASP
  7. ASP.Net (VB)

Spot a theme there? Obviously along the way I’ve learnt SQL, picked up a bit of Javascript, and know just enough C to be dangerous. Side skills like HTML/CSS are a given.


But now what? For the last 4 years I’ve been trudging along with ASP.Net. Sure, it’s a MASSIVE framework and I come across unknown functions every day, but it’s not really a challenge anymore. I want to learn something new. Languages are an obvious choice, but then you could also argue that some of the framework extensions in .NET are big enough to be classed as a language, or at least a dialect.


I’ve set myself a target of grokking one language/technology per week, though I’m pretty sure some will take less time and others much, much more. So what to learn?



  • PHP
  • Ruby
  • Ruby on Rails
  • LINQ
  • ADO Entity Framework
  • ASP.NET MVC
  • Javascript (like… learn it properly rather than my pidgin version)

PHP, for example, isn’t that ‘big’. Sure, there are a lot of functions, but so long as I can remember where to look them up I’ll be fine. Ruby on Rails looks hugely appealing for those “I need to knock this up quick sharp” moments.


So, if you had 2 months’ worth of spare time (say 1.5-3 hours per day) to learn some new skills, what would you learn?



 

Voice File Storage (aka “storing a metric crap load of MP3s”)

We record every call we take for quality control reasons, and to protect our operators against nuisance callers etc. Anyway, everything is recorded by a rather ghetto (but very workable) system which records each call onto the soundcard of the workstation that’s taking the call. It’s then pushed into a queue on our main file server as a WAV file, the filename of which contains the ‘tag’ of the message for future matching up.


Then there’s a processor server that converts all these WAV files into MP3s and dumps them onto a larger file server in this kind of fashion:


\\bigassserver\voicefiles\2009\5\1\thisfilename.mp3


This larger file server is currently running out of space, however. It’s also used for our backups, and the 1.5GB/day of voice recordings is starting to take its toll.


So how should I store all these MP3s? The server doesn’t need to be fast at all: it probably writes 1GB a day and reads maybe 100MB a day, so it’s hardly stressed. Uptime isn’t an issue either; if it goes down, things just get queued up. The data, however, is quite ‘precious’. You certainly couldn’t replace it. It’s also incredibly unwieldy: you can’t easily keep a near-complete copy offsite.


A NAS device is probably the way forward, but they come in so many flavours it’s difficult to know how to approach it. With a requirement of around 3TB of storage, there are a few options available. I could buy an expensive single NAS, put my faith in RAID5 and cross my fingers that the device itself won’t break. I could buy a pair of cheap 2-bay NAS devices, run them with no RAID at all and mirror the data across the boxes. That removes the risk of a single device breaking, but means that drives WILL fail and at some point I’ll have to rebuild from the other box. Or I could spend even more cash and do a hybrid of the two: a decent RAID5 3TB box, plus a cheap 3TB non-RAID mirror.
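For the two-box options, the nightly mirror between the devices could be as crude as this Python sketch (everything here is invented; in practice robocopy or rsync would do the job):

```python
# Crude one-way mirror: copy new/changed files from the primary box to the
# backup box, and delete files that have vanished from the primary.
# Invented sketch only -- a real setup would use robocopy/rsync.
import os
import shutil

def mirror(src, dst):
    os.makedirs(dst, exist_ok=True)
    src_files = set()
    for root, _, files in os.walk(src):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), src)
            src_files.add(rel)
            s, d = os.path.join(src, rel), os.path.join(dst, rel)
            os.makedirs(os.path.dirname(d), exist_ok=True)
            # Only copy if missing or changed (cheap size/mtime check).
            if (not os.path.exists(d)
                    or os.path.getmtime(s) > os.path.getmtime(d)
                    or os.path.getsize(s) != os.path.getsize(d)):
                shutil.copy2(s, d)
    # Remove anything on the backup that no longer exists on the primary.
    for root, _, files in os.walk(dst):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), dst)
            if rel not in src_files:
                os.remove(os.path.join(dst, rel))
```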


Option 1 – ‘High End’ NAS device


NetGear ReadyNAS 1100 4x1TB Rackmount Network Storage (NAS) RNR4410-100EUS  – £1,566.99 inc vat


Pros:  Plug and play, 3TB of RAID5 storage, should work fine.
Cons: Could dump the data if the box goes pop or the RAID volume goes pop


Option 2 – 2 x low end NAS devices


2 x NetGear ReadyNAS Duo + 4 x 1.5TB drives  =  (£205 + £200)  x 2 = £810 inc vat


Pros: Protected against device failure.
Cons: A drive failure (which WILL happen) will take the whole device down


Option 3 – 1 x medium and 1 x low end


NetGear ReadyNAS NV+ 4 Bay + 4x1TB  = £459 + (4 x £80) = £779
NetGear ReadyNAS Duo + 2 x 1.5TB drives = £405
total of £1184 inc vat


Pros: Protected against device failure AND disk failure on primary device. Can spec 4 bay with 1.5TB drives to get an extra 1.5TB of unmirrored space.
Cons: Little bit more pricey than option 2, more work to admin than option 1. 


So which option would you go for, and why?


If money was no object I’d have one of these by now…..


Building a scalable, fault tolerant background processing system (part 1)

So, you have a big web based system dealing with some pretty large databases. Do you:

  1. Do all the big database operations directly in your web app as the user clicks the button, making them wait for a good few seconds

or

  2. Do just the bare minimum (like display “Thanks”!) and farm the job out to another process somewhere

Yep, I thought so too.

So I set about building a generic system that can run these background tasks, the idea being I can simply pass jobs from any of my apps into a ‘queue’ to be processed by this background system.

Design criteria:

  • Fault tolerant (if we’re relying on this for doing things asynchronously to the UI it needs to be stable)
  • Scalable (we don’t want the queues backing up!)
  • Easy to create new types of job (The scheduler itself shouldn’t need to know anything about the jobs themselves)

After pondering the problem for a while, I decided the best way to do this was to treat each process as a web page. There are quite a few reasons for this: they’re language independent; they’re essentially multi-threaded by nature; they’re discrete, so if one web call fails it doesn’t bring down the whole process; and they’re server independent (I can call jobs on other servers). I could simply call a bunch of web pages with the appropriate query string, like so:

dobillingstuff.aspx?rowid=123
sendmessage.aspx?messageid=5678

Thinking along those lines, what if I split up big jobs into little jobs? I could have a ‘master’ job that then churned out a load of smaller ‘child’ jobs…

compilestatsforusers.aspx

…which then generates these queue requests that do the donkey work…

compileuserstats.aspx?userid=1
compileuserstats.aspx?userid=2
compileuserstats.aspx?userid=3
compileuserstats.aspx?userid=4
compileuserstats.aspx?userid=5

…etc…
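The fan-out step could be sketched like this in Python (the queue table and helper names are invented stand-ins; the URLs are the ones above):

```python
# Sketch of the master-job fan-out: the 'master' job does no heavy lifting
# itself, it just churns out one child queue entry per user. The queue is a
# SQLite table here; all names are invented stand-ins for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE JobQueue (ID INTEGER PRIMARY KEY, Url TEXT, Done INTEGER DEFAULT 0)")

def enqueue(url):
    conn.execute("INSERT INTO JobQueue (Url) VALUES (?)", (url,))

def run_compile_stats_master(user_ids):
    # Equivalent of compilestatsforusers.aspx: generate the child requests
    # that do the donkey work, one per user.
    for user_id in user_ids:
        enqueue("compileuserstats.aspx?userid=%d" % user_id)

run_compile_stats_master([1, 2, 3, 4, 5])
queued = [row[0] for row in conn.execute("SELECT Url FROM JobQueue ORDER BY ID")]
```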

Now, all I need is a way of executing these jobs……

I *must* be able to do this?

I read this yesterday on Scott Hanselman’s blog and it struck a bit of a DIY chord with me. We’re currently doing our garden up from being an overgrown jungle to something hopefully resembling the kind of thing you see on TV and go ‘Oooo’ and ‘Aaah’ at.


Now, not having set foot in a garden before (actually, that’s a lie. I have, just not with the intent of doing anything) I figured the best way was to pay someone to come along, design it, and just do it. I’d then write a big cheque and we’re done.


Well, that was until we got ‘The Quote’. Screw that then, I better learn me some garden skills. Now, as a .NET developer who spends all day solving little problems to create bigger things (hah, I avoided the architecture word!) I figured it can’t be that hard can it? After all, if little old ladies can make amazing gardens then I should be able to…


Things I learnt very fast:



  1. Digging huge amounts of ground is *HARD*. It’s ball busting, back breaking work. Pay someone else to do this bit.
  2. The woman in your life will always without fail have better ideas about what to do with the garden than you will. Fact.
  3. Even if you’re creating a little seating area down the bottom, there’s no place in the garden where it makes sense to run some Cat-5 ‘just in case’. This is why God invented Wi-Fi.


Aside from the really mundane stuff it was quite a rewarding task, and very interesting to see how computer developer skills translated into doing something manual. For example, working out how many paving slabs I need for a bit of path: that’s pretty much an array, isn’t it? Filling wheelbarrows and taking them to the skip: do you take fewer, heavier loads and risk the soil spilling out, or more smaller, lighter ones? Seems like network packets to me.


Moral of the story is, if you have the smarts to do something like fix engines, design rockets, fix people or just design software, then you’re clearly switched on enough that doing anything else on a similar level but outside your sphere is possible. It may take a bit of research and practice, but it’s well within reach.

(mind you… I haven’t tried to build the shed yet!)