Test Driven Development - the only way to roll

I started phasing into a TDD workflow about a year and a half ago. It took a while to learn. Fixtures and Test::Unit were the tools of the time (so 2006). I didn’t see much about mocking, stubbing, or even factories until I started getting into RSpec about 5 months ago. TDD definitely improved the quality of my code, but it slowed me down. I started programming with BASIC in 1984 after all (TI-99 ftw), so I have over twenty years’ worth of habits to overcome. Even now the amount of time I have spent doing functional programming dwarfs everything else.

Developers in particular must learn how to constantly adapt. Software engineering has a long way to go, and phase-gate/serial cycles aren’t the direction it needs to go in. Unfortunately there is a lot of resistance to changes in programming languages and methodologies. Most people cling to the old ways, the ways they were taught. A professor they respected told them to model every detail in UML before starting to code, so that’s what they do. Most people don’t want to constantly re-learn how to do their job. If they’re a developer, then they need to find a new career path. We need to learn better ways of doing things, and TDD is one of the most important improvements.

These days I’m doing 100% TDD and will never go back. I finally have the workflow down, and not only has my code quality dramatically improved, so has my velocity. My team is doing all TDD now and we burn through development at amazing speed. Our QA team almost never finds any defects because the developers are just as thorough during the TDD process. It took a while to get to this point, but TDD doesn’t slow me down anymore - it helps catapult me ahead.

How many times have you received a hunk of completely untested code? Yeah, I lost count too. I imagine your experiences were like mine: a bunch of mangled code that made little or no sense and really just needed to be tossed. Instead you have to black box it and put in a bunch of wrapper functionality. It’s legacy code. We’ve all experienced legacy code and it sucks. The really sad part is that we all wrote the same type of code that others are now wrestling with.

Untested code is legacy code. Everyone seems to assume that OLD code is legacy code. This isn’t true. C now isn’t much different from what C was 30 years ago. If code written in C a long time ago came with a complete testing suite, then all you would need to do is get it compiling and running. After that you can gradually make changes and rely on the tests to help lead the way. Inheriting Java code from 2 years ago is no different. If it doesn’t have tests, it’s almost certainly going to be legacy code. That means black boxes, wrappers and pain.

TDD ensures maintainability, provides documentation, helps in the design process, exposes architectural flaws early (you might think you’re an awesome coder but, trust me, you still have them) and (once you get it down) actually speeds up development. It also gives your code a life support system. Someone who comes along several years later should be able to pick up where you left off.
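For anyone who hasn’t seen the rhythm yet, here is a minimal sketch of one red/green cycle. It’s in Python with plain assert-style tests (the kind pytest picks up), and the `slugify` function is a hypothetical example I made up for this post, not something from a real project. The same shape applies in Test::Unit or RSpec.

```python
import re

# Step 1 (red): write the tests first. `slugify` is a hypothetical function
# invented for this example; it does not exist yet when these are written,
# so running the tests at this point fails.
def test_lowercases_and_joins_words_with_hyphens():
    assert slugify("Test Driven Development") == "test-driven-development"

def test_drops_punctuation_and_extra_whitespace():
    assert slugify("  The only way to roll!  ") == "the-only-way-to-roll"

# Step 2 (green): write the simplest code that makes the tests pass.
def slugify(title):
    # Pull out runs of letters and digits, lowercased, and join with hyphens.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# Step 3 (refactor): clean up the implementation while the tests stay green.
```

The tests double as documentation: whoever inherits `slugify` can read them to see what it is supposed to do, and can change the implementation knowing the tests will complain if the behavior shifts.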

Give it a try and stick with it (if you haven’t already). It’s the only way to roll.

Posted by chrisp Thu, 13 Dec 2007 14:38:00 GMT


URIs, URLs, Conventions and Experts

I always seem to be on the losing side of arguments that debate the finer points of the Universe. Sometimes the details are important but most of the time they’re just noise.

In these cases, I tend to follow the convention unless I have a strong reason not to. I don’t necessarily know how the convention came to be in all cases.

Unfortunately it’s surprisingly hard to convince others to do the same. Everyone likes to think they’re an expert. Expert is a subjective term though. As a developer, I picked the Ruby and Rails camps. I don’t necessarily blindly agree with everything that comes out of the mouths of Dave Thomas, Jim Weirich, and DHH, but I have learned to trust their judgment.

Finding mentors makes life easier. If you know you can trust the words of someone 90+% of the time, then you can build on what they know and not bother repeating the same arguments they have gone through. This is how “experts” work. Experts have a network of peers that they trust. That’s why they advance so much more quickly. They don’t bother repeating the details.

In order to become an expert you have to assume that you don’t already know everything.

I’ve encountered this situation once again. I’ve been following the RESTful Rails naming convention of using URIs where others use URLs. I did this because I trusted the judgment of the Rails Core team that this was the proper term. I always hated the interchangeability of the two so I was fine with just using one.

The debates came. There were inappropriate URIs strewn all over my code! These are supposed to be URLs, I was told. I was asked WHY?! Why oh why have you strewn sloppy terminology all over our code? My answer, “that’s the Rails convention,” was not good enough. My peers wanted to replace all of these, so I said “fine - if you really want to take the time on that.”

So… migrations were added, and test cases, source, and documentation were all changed to reflect the “proper” naming scheme.

The truly ironic part? The initial naming scheme was right. That’s correct. All of that work changed things to the WRONG terminology. It turns out that the Rails core team was right (imagine that). Every URL is a URI, but not every URI is a URL: a URL is the kind of URI that also says how to locate the resource, which is why it starts with an access scheme such as http or ftp. The current URI specification (RFC 3986) recommends using the general term URI rather than the more restrictive URL, so URI was the right word for our code all along.

I found the URI below, which describes the finer points of how the two differ.

http://ajaxian.com/archives/uri-vs-url-whats-the-difference
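As a rough illustration of the distinction, here is a small sketch using Python’s standard `urllib.parse`. The `describe` helper is made up for this example, and the ISBN URN is just a sample value. A web address has a scheme plus an authority that tells you where to go; a URN is still a perfectly good URI, but it only names something.

```python
from urllib.parse import urlsplit

def describe(uri):
    """Split a URI into a few of the generic parts (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
    }

# A URI that is also a URL: the scheme and authority say how and where
# to fetch the resource.
web = describe("http://ajaxian.com/archives/uri-vs-url-whats-the-difference")
print(web["scheme"], web["authority"])    # prints: http ajaxian.com

# A URI that is only a name: there is no authority to connect to.
book = describe("urn:isbn:0451450523")
print(book["scheme"], book["path"])       # prints: urn isbn:0451450523
```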

The lesson is to not assume you know more than everyone else. When a group of experts agree on something, maybe they’re right! Investigate the issue if you like, but don’t assume they’re wrong and you’re right.

Posted by chrisp Tue, 11 Dec 2007 14:13:00 GMT


Why I Converted to Search Servers

Many who have worked with me in the past know that I always resisted including search servers (Ferret/Lucene/Solr) in my projects. I argued that they added unnecessary complexity and didn’t provide much that some fancy database searching couldn’t do.

I was wrong.

While I suppose that I still consider the above arguments to be valid, there is one more important argument on the side of using search servers - scalability. I share many rubyists’ disdain for the “S” word. It gets thrown around a lot and typically results in overbloated software that still doesn’t perform well even when “scaled.” There really isn’t a better word in this case though.

HTTP is incredible. As the recent RESTful advocates have pointed out, the protocol is infinitely scalable, simple, and is (in a large way) responsible for the explosive growth of the web. Web servers getting bogged down? No problem, just add another (as long as you’re not trying to do any - or at least not much - state tracking). You can keep adding servers to your heart’s content. A good admin can have this process down to a five minute setup.

Eventually all of those database calls add up though and your database starts straining under load. What do you do now? You can start optimizing your queries, do some fancy caching, add a slave database and keep a read-only connection open to it or (the mother of all evils) create a database cluster.

All of these options suck.

All of that beloved simplicity and scalability of HTTP is gone and replaced by infrastructure that’s extremely expensive and hard to maintain. Sometimes you just can’t get around the database cluster - but it should be a weapon of last resort. Query optimization isn’t so bad if done correctly, but it will only get you so far. Query optimization can also take nice clean code and turn it into a mangled nightmare. Caching is also good in moderation, but too much of it is also a code mangler and can make your application’s usability suffer (waiting 10 minutes for a user’s change to appear is often not usable). Database slaves are fine as failover/backup options but reading from one db and writing to the other also produces mangled code.

The solution? Search servers. Search servers scale out the same way web servers do. You can run one on each web server, or you can have a centralized search server if you prefer. Whenever you pull results from the search server, your database has avoided the blow. You’ll still probably need to hit the database occasionally, but your load will be dramatically lower. The load is on the web server end and can be shared across the web cluster without any fancy configuration. Once your search server is in place, use it for as much querying as you possibly can. The growth of your database load will be a nice, slowly climbing line that will probably keep pace with processor speeds. If done right, you can avoid the extreme overhead involved with a database cluster.
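To make the idea concrete, here is a toy sketch of the read path. Everything in it is an assumption for illustration: `SearchIndex` is a tiny in-memory inverted index standing in for a real search server (it is not the Ferret, Lucene, or Solr API), and `FakeDatabase` just counts the queries it receives. Writes go to both; reads go to the index.

```python
from collections import defaultdict

class FakeDatabase:
    """Stand-in for the real database; counts the queries it receives."""
    def __init__(self):
        self.rows = {}
        self.query_count = 0

    def insert(self, row_id, title):
        self.rows[row_id] = title

    def find_by_word(self, word):
        # Every call here is load on the one shared database.
        self.query_count += 1
        return sorted(i for i, t in self.rows.items()
                      if word.lower() in t.lower().split())

class SearchIndex:
    """Toy inverted index standing in for a search server."""
    def __init__(self):
        self.postings = defaultdict(set)   # word -> ids of matching rows

    def add(self, row_id, title):
        for word in title.lower().split():
            self.postings[word].add(row_id)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))

def save_post(db, index, row_id, title):
    # Writes still hit the database, and the index is updated alongside.
    db.insert(row_id, title)
    index.add(row_id, title)

db = FakeDatabase()
index = SearchIndex()
save_post(db, index, 1, "Why I converted to search servers")
save_post(db, index, 2, "Search servers and HTTP")
save_post(db, index, 3, "Test driven development")

# Reads are answered by the index; the database never sees them.
hits = index.search("servers")
print(hits, db.query_count)   # prints: [1, 2] 0
```

The point of the sketch is the counter: however many searches the web tier serves, `query_count` only moves when something actually asks the database.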

Posted by chrisp Mon, 10 Dec 2007 14:08:00 GMT