Make Your Application Stand on its Own

Posted by chrisp Sat, 01 Mar 2008 21:54:00 GMT

I’ve come across a number of web applications recently that cross-link between other applications to steal their functionality.

This sucks!

It leads to poor application design as sooner or later developers come along and are naturally confused about what should go where. They then increase the amount of cross-stitching between the applications. Eventually you end up with one large application that’s not so great at doing two things. You have to get both applications working independently when you just want to use one. It wastes time and it creates a mess. The more applications you connect this way, the worse it becomes. It lowers the quality of both applications and kills code velocity.

Stop the madness!

For those of us that are Rails developers there is a pre-determined place to put outside dependent functionality - the ‘vendor’ path. If you put all of your external dependencies there you have a nice self-contained application that can stand on its own. Functionality shared between applications can be broken out into mutual dependencies and be pulled into the respective applications as plugins, frameworks, or gems that are stored in the vendor path.

If you have a framework, put it into “vendor/framework.” This is where Rails goes, for instance (vendor/rails).

If you have a plugin, put it into “vendor/plugins.” It’s the standard plugin location for Rails.

If you have a gem, install gemsonrails and put it into “vendor/gems.”

Now everything is nice and tidy. All you have to do to get your application ready for development or deployment is to check it out. No gem installations or framework installations. The dependencies are already met as soon as you get the code.
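
To make "the dependencies are already met" a bit more concrete, here is a rough sketch of what pulling vendored gems into the load path amounts to at the plain Ruby level. It is not a description of how gemsonrails works internally; the helper name and the vendored_greeter gem are made up for illustration, and it assumes each unpacked gem keeps its code under a lib directory.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: put every vendor/gems/*/lib directory at the
# front of the load path so `require` finds the vendored copy first.
def add_vendored_gems_to_load_path(app_root)
  Dir.glob(File.join(app_root, "vendor", "gems", "*", "lib")).sort.each do |lib_dir|
    $LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)
  end
end

# Demo against a throwaway application tree containing one made-up gem.
Dir.mktmpdir do |app_root|
  lib_dir = File.join(app_root, "vendor", "gems", "vendored_greeter-1.0.0", "lib")
  FileUtils.mkdir_p(lib_dir)
  File.write(File.join(lib_dir, "vendored_greeter.rb"),
             "module VendoredGreeter; def self.hello; 'hello from vendor'; end; end\n")

  add_vendored_gems_to_load_path(app_root)
  require "vendored_greeter"   # resolved from vendor/gems, no gem install needed
  puts VendoredGreeter.hello
end
```

The point is only that a fresh checkout carries everything `require` needs; nothing has to be installed on the machine first.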



Test Driven Development - the only way to roll

Posted by chrisp Thu, 13 Dec 2007 14:38:00 GMT

I started phasing into a TDD workflow about a year and a half ago. It took a while to learn. Fixtures and Test::Unit were the tools of the time (so 2006). I didn’t see much about mocking, stubbing, or even factories until I started getting into RSpec about 5 months ago. TDD definitely improved the quality of my code but it slowed me down. I started programming with BASIC in 1984 after all (TI-99 ftw) so I have over twenty years worth of habits to overcome. Even now the amount of time I have spent doing functional programming dwarfs everything else.

Developers in particular must learn how to constantly adapt. Software Engineering has a long way to go and phase-gate/serial cycles aren’t the direction it needs to go in. Unfortunately there is a lot of resistance to changes in programming languages and methodologies. Most people cling to the old ways, the ways they were taught. A professor they respected told them to UML model every detail before starting to code so that’s what they do. Most people don’t want to constantly re-learn how to do their job. If they’re a developer then they need to find a new career path. We need to learn better ways of doing things and TDD is one of the most important improvements.

These days I’m doing 100% TDD and will never go back. I finally have the workflow down and not only has my code quality dramatically improved, so has velocity. My team is doing all TDD now and we burn through development at amazing speed. Our QA team almost never finds any defects because the developers are just as thorough during the TDD process. It took a while to get to this point, but TDD doesn’t slow me down anymore - it helps catapult me ahead.

How many times have you received a hunk of completely untested code? Yeah I lost count too. I imagine your experiences were like mine: a bunch of mangled code that made little or no sense that really just needed to be tossed. Instead you have to black box it and put in a bunch of wrapper functionality. It’s legacy code. We’ve all experienced legacy code and it sucks. The really sad part is that we all wrote the same type of code that others are now wrestling with.

Untested code is legacy code. Everyone seems to assume that OLD code is legacy code. This isn’t true. C now isn’t much different from what C was 30 years ago. If code written in C a long time ago came with a complete testing suite, then all you would need to do is get it compiling and running. After that you can gradually make changes and rely on the tests to help lead the way. Inheriting Java code from 2 years ago is no different. If it doesn’t have tests, it’s almost certainly going to be legacy code. That means black boxes, wrappers and pain.

TDD ensures maintainability, provides documentation, helps in the design process, exposes architectural flaws early (you might think you’re an awesome coder but, trust me - you still have them) and (once you get it down) actually speeds up development. It also gives your code a life support system. Someone that comes along several years later should be able to pick up where you left off.

Give it a try and stick with it (if you haven’t already). It’s the only way to roll.

URIs, URLs, Conventions and Experts

Posted by chrisp Tue, 11 Dec 2007 14:13:00 GMT

I always seem to be on the losing side of arguments that debate the finer points of the Universe. Sometimes the details are important but most of the time they’re just noise.

In these cases, I tend to follow the convention unless I have a strong reason not to. I don’t necessarily know how the convention came to be in all cases.

Unfortunately it’s surprisingly hard to convince others to do the same. Everyone likes to think they’re an expert. Expert is a subjective term though. As a developer, I picked the Ruby and Rails camps. I don’t necessarily blindly agree with everything that comes out of the mouths of Dave Thomas, Jim Weirich, and DHH, but I have learned to trust their judgment.

Finding mentors makes life easier. If you know you can trust the words of someone 90+% of the time, then you can build on what they know and not bother repeating the same arguments they have gone through. This is how “experts” work. Experts have a network of peers that they trust. That’s why they advance so much more quickly. They don’t bother repeating the details.

In order to become an expert you have to assume that you don’t already know everything.

I’ve encountered this situation once again. I’ve been following the RESTful Rails naming convention of using URIs where others use URLs. I did this because I trusted the judgment of the Rails Core team that this was the proper term. I always hated the interchangeability of the two so I was fine with just using one.

The debates came. There were inappropriate URIs strewn all over my code! These are supposed to be URLs I was told. I was asked WHY?! Why oh why have you strewn sloppy terminology all over our codes? My answer of that’s the Rails convention was not good enough. My peers wanted to replace all of these, I said “fine - if you really want to take the time on that.”

So… migrations are added, test cases modified, source, documentation - all changed to reflect the “proper” naming scheme.

The truly ironic part: the initial naming scheme was right. That’s correct. All of that changed things to the WRONG terminology. It turns out that the Rails core team was right (imagine that). While it’s true that a URL must have an http, ftp, etc. scheme at the beginning, that only makes it one kind of URI: the kind that says where a resource lives. URI is the general term for a resource identifier, which is what RESTful Rails is naming, and RFC 3986 recommends using “URI” rather than the more restrictive “URL” in new specifications and documentation.

I found the below… URI.. which describes the fine points of how the two differ.

http://ajaxian.com/archives/uri-vs-url-whats-the-difference
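
If you want to poke at the distinction yourself, Ruby's standard URI library is a handy place to do it. The sketch below is only an illustration: the three identifiers are made up, and the "has a host" check is a rough stand-in for "this identifier also tells you where to go", not an official test for URL-ness.

```ruby
require "uri"

# Hypothetical identifiers, used only for illustration.
examples = [
  "http://example.com/events/42",      # scheme + location
  "http://example.com/events/42.xml",  # same idea; the extension is optional
  "urn:isbn:0451450523"                # names a thing, says nothing about where it lives
]

parsed = examples.map { |text| URI.parse(text) }

parsed.each do |uri|
  # Rough heuristic: an identifier with a host also tells you how to reach it.
  kind = uri.host ? "locator" : "name only"
  puts "#{uri.scheme.ljust(5)} #{kind.ljust(10)} #{uri}"
end
```

All three parse as URIs; only the first two also locate something.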

The lesson is to not assume you know more than everyone else. When a group of experts agree on something, maybe they’re right! Investigate the issue if you like, but don’t assume they’re wrong and you’re right.

Why I Converted to Search Servers

Posted by chrisp Mon, 10 Dec 2007 14:08:00 GMT

Many who have worked with me in the past know that I always resisted including search servers (ferret/lucene/solr) into my projects. I argued that they added unnecessary complexity and didn’t provide much that some fancy database searching couldn’t do.

I was wrong.

While I suppose that I still consider the above arguments to be valid, there is one more important argument on the side of using search servers - scalability. I share many rubyists’ disdain for the “S” word. It gets thrown around a lot and typically results in overbloated software that still doesn’t perform well even when “scaled.” There really isn’t a better word in this case though.

HTTP is incredible. As the recent RESTful advocates have pointed out, the protocol is infinitely scalable, simple, and is (in a large way) responsible for the explosive growth of the web. Web servers getting bogged down? No problem, just add another (as long as you’re not trying to do any - or at least not much - state tracking). You can keep adding servers to your heart’s content. A good admin can have this process down to a five minute setup.

Eventually all of those database calls add up though and your database starts straining under load. What do you do now? You can start optimizing your queries, do some fancy caching, add a slave database and keep a read-only connection open to it or (the mother of all evils) create a database cluster.

All of these options suck.

All of that beloved simplicity and scalability of HTTP is gone and replaced by infrastructure that’s extremely expensive and hard to maintain. Sometimes you just can’t get around the database cluster - but it should be a weapon of last resort. Query optimization isn’t so bad if done correctly, but it will only get you so far. Query optimization can also take nice clean code and turn it into a mangled nightmare. Caching is also good in moderation, but too much of it is also a code mangler and can make your application’s usability suffer (waiting 10 minutes for a user’s change to appear is often not usable). Database slaves are fine as failover/backup options but reading from one db and writing to the other also produces mangled code.

The solution? Search servers. Search servers can grow infinitely. You can run one on each web server or you can also have a centralized search server if you prefer. Whenever you pull results from the search server your database has avoided the blow. You’ll still probably need to hit the database occasionally, but your load will be dramatically lower. The load is on the web server end and can be shared across the web cluster without any fancy configuration. Once your search server is in place, use it for as much querying as you possibly can. The growth of your database load will be a nice slowly climbing linear line that will probably keep pace with processor speeds. If done right, you can avoid the extreme overhead involved with a database cluster.
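
To see why a search index takes the blow for the database, here is a toy sketch. It is not Ferret, Lucene, or Solr, and the TinyIndex class is invented purely for illustration: it stands in for a search server by keeping an inverted index in memory, so a query is answered from the index without any database call.

```ruby
# Toy stand-in for a search server: an in-memory inverted index.
class TinyIndex
  def initialize
    # term => list of record ids containing that term
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  # Index a record once, at write time.
  def add(id, text)
    text.downcase.scan(/\w+/).uniq.each { |term| @postings[term] << id }
  end

  # Return the ids of records that contain every term in the query.
  def search(query)
    terms = query.downcase.scan(/\w+/)
    return [] if terms.empty?
    terms.map { |term| @postings.fetch(term, []) }.inject { |a, b| a & b }
  end
end

index = TinyIndex.new
index.add(1, "Jazz night at the lakefront")
index.add(2, "Lakefront fireworks")
index.add(3, "Jazz brunch")

puts index.search("jazz lakefront").inspect
```

The work of indexing happens once when a record is written; every read after that is a lookup the database never sees.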

Typo Pains

Posted by chrisp Sat, 08 Sep 2007 16:27:00 GMT

You may have noticed a lot of errors popping up recently around here. The problem has been how triggers work with Typo. It seems that publishing articles in the future will make Typo go kaput if you’re on Dreamhost. It doesn’t seem to happen anywhere else - there must be something unique to the Dreamhost environment.

I was able to fix this problem by deleting the content items in question, along with the related articles_tags and triggers. This required a combination of work in the typo console and mysql cli client to do. You have to find the id of the items in question. Delete the rows matching that id in the contents table, delete the rows that have that id for articles_id in the articles_tags table and delete the rows that have that id for pending_item_id in the triggers table.
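
If you end up doing this more than once, a tiny helper that prints the three statements for a given id saves some typing. This is a sketch only: the helper name is made up, the table and column names are copied straight from the description above (check them against your own schema), and I have put the dependent rows first, which is a judgment call rather than something Typo requires.

```ruby
# Hypothetical helper: build the three cleanup statements for one stuck item.
def cleanup_statements(content_id)
  id = Integer(content_id) # raises unless the input is a plain integer
  [
    "DELETE FROM triggers WHERE pending_item_id = #{id};",
    "DELETE FROM articles_tags WHERE articles_id = #{id};",
    "DELETE FROM contents WHERE id = #{id};"
  ]
end

# Print the statements so they can be reviewed before pasting into the mysql client.
puts cleanup_statements(123)
```

Printing instead of executing is deliberate; you get to read the statements before anything is deleted.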

What a great way to start the weekend! :)

Obviously this means that I had some articles coming that will be delayed now. I’ve read elsewhere that you can get around this by marking the publish date in the future but leaving the “publish” item checked.

Be a Zealot - Rails Edge Day 1

Posted by chrisp Fri, 24 Aug 2007 02:23:00 GMT

What a day! There were certainly some great talks at Rails Edge on the first day. This all happened during a storm that must have been an act of God. I certainly had no idea what was in store for me once I left to drive home… hopefully others got home OK. I’m just glad my car didn’t float away on me at any point (honestly I’m surprised it didn’t). I was very close to just bunking in the car at some random parking lot overnight. Had I known how bad it was I would have just stayed at the hotel…

Anyway… the #1 highlight I took from day 1 is: “Be Zealous.” There are many other important highlights, but I think this one is the most important. Chad Fowler did a superb job of going over how to produce “quick and clean” (as opposed to quick and dirty) code. It took a strong force of will to get Rails where it is today. It wouldn’t have happened without the strong opinions and determination of the core team. If we’re going to continue to make the world a better place for software development, then we need to stay true to our values. These are mainly MVC, CRUD, constraint-driven development, metaprogramming and domain-specific constructs and more recently REST.

From my perspective, I hope Ruby/Rails developers continue to evangelize for the cause - but I also hope that we can all recognize when we have turned down the wrong path. Too often I’ve heard the excuse of “yeah it’s a bad way to do it, but there are already thousands of applications in production that rely on that.” The end result is that the problem just becomes worse and worse.

No matter how far down the wrong road you’ve gone, turn back.

Maintaining a zealous stance for the Ruby and Rails core values while also being open to questioning our current path will obviously require a delicate balance. The forces may pull in different directions, but developers will need to learn how to walk the line. To quote Dave Thomas’ speech today, “An object is made from a class and a class is a type of object.” Patterns often become cyclical.

Dave Thomas’ speech on metaprogramming really cut to the chase on how frameworks like Rails are built from Ruby and how you can leverage this power for yourself. If you find yourself coding a non-trivial application, then you really should think of domain-specific ways to describe your application. You can do much more than just build an API, your application can (and probably should) consist of its own language in many places. The language is built on Ruby (and possibly Rails as well), but is pieced together in sections that are self-describing. If done properly, this will empower you with productive development throughout the product’s life-cycle. It may be that your framework is so good that it will survive future generations.

A passive observer probably would have thought today’s conference was really “Macbook Pro/iPhone Edge.” A slightly more observant person may have thought it was really Java-loves-Rails Edge. A majority of the crowd seemed to consist of people that were either currently Java developers looking into Ruby, or Java developers that have already made the jump. Add a great talk on JRuby into the mix and there’s a lot of Java floating around. I find this amazing considering the disregarding attitude that seemed to come from the Java community towards Ruby and Rails just a couple of years ago. This change is definitely good for both camps though. The number of Java developers swarming towards Ruby is very reminiscent of how C++ developers swarmed to Java circa 1999.

Personally, I never drank the coffee. It’s not that I didn’t think Java was a great technology, I just felt that it only went half way and didn’t really see the point in that. In my mind, if you wanted efficiency - then just use C++. If you wanted productivity and multi-platform functionality, then Perl and PHP really got things done much faster (despite their crappy OO support) - at least as far as web application development goes. A majority of us that started on Rails back in the beta, 1.0/1.1 days were PHP developers that believed in dynamic languages but also believed that there must be a better way.

Whatever the path, we seem to be converging on the same road now and this can only be a good thing.

Even though I don’t really care too much about Java, I was blown away by Justin Gehtland’s presentation on JRuby. This technology really has come far and has the enormous advantage of being able to run all of the bajillion of Java libraries out there in conjunction with Ruby (and Ruby on Rails) code. This technology has developed amazingly fast and I really wonder if this may end up being the Rails platform of choice down the line. It certainly makes the migration from Java to Ruby quite painless.

Marcel Molina quickly went over his “Presenter” concepts and lamented the state of Rails templating language options. It is definitely clear that ERB is near the end of its life-span in this role.

Both Marcel and Chad reiterated many times the evils of putting code in your views. I have recently felt a little over-zealous about all of the times I’ve gone batty about ActiveRecord calls that are in a view. I haven’t even gotten to where I try to thump the no-code-in-view philosophy yet. I definitely feel encouraged that this seems to be the perspective of the core team. I’ll also remember that being zealous is generally good…

Ezra Zygmuntowicz (man that’s a tongue twister) proved that stable extreme-volume Rails environments are indeed a reality. His discussion on Xen was excellent. In short, the platform of choice is: Linux (via Xen) + Nginx + mongrel + monit. The swiftiply patch to mongrel is now stable and can greatly increase your throughput. If you don’t know these technologies yet, you owe it to yourself to get to know them.

Last but not least, Mike Mangino gave another great talk on “RESTful” Rails development. Simply Helpful was covered, along with changes that are going into Edge Rails - to be released with Rails 2.0. After talking over REST concepts with the other developers and listening to Mike’s speech, I walked away convinced that REST is indeed the way to go.

I’m looking forward to Day 2, provided that I can swim my way down there!

Rails Edge Bound

Posted by chrisp Wed, 22 Aug 2007 21:43:00 GMT

I’m looking forward to heading down to the Rails Edge conference in Chicago tomorrow (8/23/07). Tribune Interactive was nice enough to pay my way. I will be sharing my notes and experiences here, so stop by (or check the feed) occasionally.

Loving CachedModel/memcached

Posted by chrisp Sat, 26 May 2007 03:41:00 GMT

I’ve actually had CachedModel/Memcached running for quite some time, but I didn’t notice that it stopped automatically working for simple calls when Rails 1.2 came out. At the time we were just ramping up with our new dev team and the other Rails sites I had done were nearing the end of their useful life-span. So database load was low in general, a perfect time for Rails 1.2 to sneak one in on me.

Thanks to a combination of successful product launches and ever more dynamic websites (allowing for an ever decreasing amount of cacheable material) our main database server began to become stressed. Finally I noticed that memcached was in fact not caching much at all. After looking into it, I found this ticket describing the problem and (fortunately) a fix.

Once I fixed the problem (and one of the other devs added some key fragment caching) the load went down almost 90%!! It’s ironic that it took CachedModel breaking for me to be reminded just how great it is.

If you want to try it, follow this tutorial for setting up CachedModel. It’s still fairly relevant.

There’s a fancy new Acts as Cached plugin available that also abstracts memcached and may be even better.

If you’re running high-capacity sites, some type of memcached interface is a must-have!!

A hacked theme_support that (mostly) works

Posted by chrisp Mon, 26 Mar 2007 21:29:00 GMT

A theme should allow the look and feel of an application to change. A theme has a one-to-one relationship with the application, meaning that an application should only be running one theme at a time. A theme per user would be a ‘skin’ and an application that runs multiple sites with a different “theme” for each one is a whole new can of worms. I’ve had a need to do the latter recently, but I’ll go over my (site_support) solution to that in the next post. Meanwhile, here’s my solution for adding themes to your application.

When to use it: A single application connects to a single database and has multiple theme options, but only one is running at a time.

Start by installing my hacked version of theme_support

script/plugin install http://www.chrispcritter.com/theme_support_mod/ 

ActionMailer currently doesn’t work with it, I’ll have to tackle a fix for it soon. I’ll post my findings.

At this point you should at least be able to get your project running in mongrel.

To create the theme, you’ll need to create a “themes” folder in your project and start your first theme. The theme must have the following structure:

  $app_root
    themes/
      [theme_name]
        cache/
        images/
        stylesheets/
        javascripts/
        views/           
          layouts/      
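
If you would rather not create those folders by hand, something like the following will do it. It is a sketch, not part of theme_support: the helper name and the "midnight" theme are made up, and the demo runs against a temporary directory so it does not touch a real project.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: create the theme folder layout shown above.
def create_theme_skeleton(app_root, theme_name)
  theme_root = File.join(app_root, "themes", theme_name)
  %w[cache images stylesheets javascripts views/layouts].each do |dir|
    FileUtils.mkdir_p(File.join(theme_root, dir))
  end
  theme_root
end

# Demo in a throwaway directory; pass your real application root instead.
Dir.mktmpdir do |app_root|
  theme_root = create_theme_skeleton(app_root, "midnight")
  puts Dir.glob(File.join(theme_root, "**", "*")).sort.map { |path| path.sub(app_root, "") }
end
```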

You may notice that this is a little different than the structure described in the theme_support documentation. I could not get layouts to work outside of the view path (so I just moved it into views). I really think this is better anyway - since it matches the standard view structure. The layout I describe also has a “cache” folder. The theme_support plugin supposedly supports cached files in a public/themes structure but I had some difficulty getting this to work and really wanted to keep all theme related content in its own folder. I ended up setting page caching to go into the themes/[theme_name] folder with some mod_rewrite magic. It may be better to add a “public” folder to each theme for this at some point.

Add a line similar to the following to config/environment.rb (where <DEFAULT> is your default theme):

THEME = ENV["THEME"] ? ENV["THEME"] : '<DEFAULT>'

And add the below line to pull out your site specific configuration options.

CONFIG = YAML.load(File.read("#{RAILS_ROOT}/config/#{THEME}.yml"))
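
Here is the same idea as a standalone sketch you can run outside of Rails to see what ends up in the config hash. The helper names, the "midnight" theme, and the site_name key are all made up for illustration; your own YAML file will have whatever keys your site needs.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical helper: pick the theme from the environment, else the default.
def theme_name(env, default)
  env["THEME"] || default
end

# Hypothetical helper: load config/<theme>.yml under the given root.
def theme_config(root, theme)
  YAML.load(File.read(File.join(root, "config", "#{theme}.yml")))
end

# Demo with a throwaway config directory standing in for RAILS_ROOT.
Dir.mktmpdir do |root|
  Dir.mkdir(File.join(root, "config"))
  File.write(File.join(root, "config", "midnight.yml"), "site_name: Midnight Events\n")

  theme = theme_name({}, "midnight") # THEME is not set, so the default wins
  config = theme_config(root, theme)
  puts "#{theme}: #{config["site_name"]}"
end
```

Starting the application with THEME set in the environment is then all it takes to switch themes.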

If you’re using apache, your rewrite rules will look like the following:

RewriteRule ^([^.]+)$ themes/<THEME>/$1.html [QSA]
RewriteRule ^images/(.*)$ themes/<THEME>/images/$1 [QSA]

You will need to modify the caching paths in environment.rb so they point to the right place. Add the following lines:

ActionController::Base.fragment_cache_store = :file_store, "#{RAILS_ROOT}/themes/#{THEME}/cache"
ActionController::Base.page_cache_directory = "#{RAILS_ROOT}/themes/#{THEME}"

Add a link in public that points to your themes directory “public/themes -> ../themes/” That will preserve the routes.

It’s not the most elegant solution but it works. It bypasses some routes functionality and the sym-link in particular is ugly. I’ll post more mods to theme_support as/if I make them along with simpler install instructions, but this was really just a stepping stone to my site_support setup - so it may be a while before I get back to it.

Using Proc objects with fragment caching

Posted by chrisp Tue, 13 Mar 2007 19:13:00 GMT

Caching a highly dynamic application can be hard. Page caching doesn’t work well because some elements of the page need to change where others don’t. Action caching helps in the very few instances where you want all of your before filters to run before loading the cached view. In this instance the before filters are typically redundant and may be doing some heavy lifting that can really slow things down. Fragment caching will cache a certain section of the view or layout but seems somewhat useless at first glance because all of the heavy lifting was already done in the controller.

How do you get around this problem? There are really only three options:

1) Put the necessary code into a model and call it in the view inside a fragment cache. This essentially by-passes the controller and completely breaks MVC rules.

2) Put the necessary code into a helper and call it in the view inside a fragment cache. This works in situations where a helper is necessary, but usually this isn’t what we’re looking for. Calling heavy lifting that’s in helpers from a view is not quite as ugly as calling from a model, but is generally bad design. Helpers should only manipulate existing data, not pull new data.

3) Pass a Proc object from the controller to the view, and call it from inside the fragment cache.

The third option is best. It doesn’t break MVC because the code that gets executed is defined in the controller and runs with the controller’s context. Its execution is simply delayed until it gets called from within the view. It does dirty your views somewhat and probably won’t work for liquid templates, but it (mostly) follows the rules and gets the job done in a way that is more elegant than the first two options.

This is how you do it.

In this instance, I want to retrieve data that is used for the ‘category_links’ action and view (normally called as a partial). Notice that the Event model contains the actual code to retrieve the data. Recent trends in how to separate model and controller functionality have convinced me this is the way to go. Also note that the controller still lies between the view and model in this setup. See: skinny-controller-fat-model

def category_links
  # Set up a Ruby Proc method to pass (usually to a view).  This 
  # method is defined here but cannot be called until @get_event_categories.call 
  # is used.  This allows the logic execution to be deferred until view 
  # time, but it is defined here (where it should be).  This doesn't 
  # break MVC rules while maintaining a clean syntax.  See Ruby Proc Class    
  @get_event_categories = lambda do
    categories = Event.categories.collect {|c| [c.category, c.event_count, {
      :controller => 'events',
      :action => 'list',
      'event[category]' => c.category,
      'event_date[start_date]' => date_range_start,
      'event_date[end_date]'   => date_range_end}]}

    sub_categories = Event.sub_categories.collect {|c| [c.sub_category, c.event_count, {
        :controller => 'events',
        :action => 'list',
        'event[sub_category]' => c.sub_category,
        'event_date[start_date]' => date_range_start,
        'event_date[end_date]'   => date_range_end}]}  

    return categories, sub_categories
  end
end

@get_event_categories is an instance variable holding a Proc object that contains the code needed to retrieve the categories and sub_categories variables. At this point none of the code has actually been executed. Ruby has just stored the code and the environment needed to run it in @get_event_categories.
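
You can watch the deferral happen in plain Ruby, without Rails. The sketch below uses made-up data and a counter in place of the real Event queries: defining the lambda runs nothing, it captures the surrounding variables, and `return a, b` inside a lambda hands back both values so the caller can split them apart.

```ruby
calls = 0
date_range_start = "2007-03-01" # captured by the lambda below

get_event_categories = lambda do
  calls += 1 # stands in for the expensive queries
  categories     = ["Music from #{date_range_start}", "Theatre from #{date_range_start}"]
  sub_categories = ["Jazz from #{date_range_start}"]
  return categories, sub_categories
end

calls_before = calls # nothing has run yet

# The work happens only here, when the lambda is called.
categories, sub_categories = get_event_categories.call
calls_after = calls

puts "calls before: #{calls_before}, calls after: #{calls_after}"
puts categories.inspect
puts sub_categories.inspect
```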

For completeness sake, here is what gets called above via the Event API:

# Class-level finders on Event, called from the controller as
# Event.categories and Event.sub_categories.
def self.categories
  conditions = Event.theme_filters

  # Only pass :conditions when a theme filter is actually present.
  find_hash = {}
  find_hash[:select] = 'events.id,category,count(events.id) AS event_count'
  find_hash[:conditions] = conditions if conditions && conditions != ''
  find_hash[:group] = 'category'
  find(:all, find_hash)
end

def self.sub_categories
  conditions = Event.theme_filters
  # Append AND only when there is a theme filter to join onto.
  conditions += ' AND ' if conditions && conditions != ''

  find(:all,
    :select => 'id,sub_category, count(events.id) AS event_count',
    :conditions => "#{conditions} sub_category != ''",
    :group  => 'sub_category')
end

Here is the view code for category_links:

<% categories, sub_categories = @get_event_categories.call%>
<% categories.each do |category, event_count, cat_params| %>
  <%= link_to_remote category, {
    :url => cat_params,
      :loading => "Element.show('status')"}, {
    :href => url_for(cat_params)} %>
  (<%= event_count %>)
    <br />
<% end %>

<% sub_categories.each do |sub_category, event_count, sub_cat_params| %>
  <%= link_to_remote sub_category, {
    :url => sub_cat_params, 
      :loading => "Element.show('status')"}, {
    :href => url_for(sub_cat_params)}%>
  (<%= event_count %>)
    <br />
<% end %>

The @get_event_categories proc is called first and the result is unpacked into the categories and sub_categories variables. These are then used to generate the actual links.

No caching you say? Since these are almost always called as partials, I don’t cache here. It is assumed that a direct call wants a non-cached version.

My event layout contains the following rail section:

<% cache(:action_suffix => "rail_links")  do %>    
  <div class="lp15">
    <p class="txt_white">
        <%= render :partial => 'price_links', :layout => false %>
    </p>
    <p class="topic_purp">
      Browse by date
    </p>            
    <p class="txt_white">
      <%= render :partial => 'date_links', :layout => false %>
    </p>
    <p class="topic_purp">
      Browse by category
    </p>  
    <p class="txt_white">                   
        <%= render :partial => 'category_links', :layout => false %>
    </p>
    <p class="topic_purp">
      Browse by location
    </p>  
    <p class="txt_white">                      
      <%= render :partial => 'location_links', :layout => false %>
    </p>
    <p class="topic_purp">
      Browse by venue
    </p>          
    <p class="txt_white">          
      <%= render :partial => 'venue_links', :layout => false %>
    </p>
  </div>
<% end %>

Notice that the entire rail gets cached, and the category links (generated with our proc call) are among them. All of the other links have similar proc calls. I have a before_filter in my controller that retrieves the proc objects before going to the view (for the appropriate actions). If the cache is hit, then the proc methods are never called and no time is wasted on the expensive data pulls and calculations. You may notice that the category link calls are fairly expensive (they group and count the results, which adds a lot of time to the query). The other links are just as expensive. Doing this literally saves seconds on the page load speed.
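
Here is the cache-hit behavior boiled down to a few lines of plain Ruby. FragmentCache is a made-up stand-in for the Rails cache helper (it is not the Rails API), and the link data is invented; the sketch only shows that once the fragment is stored, the proc behind it is never called again.

```ruby
# Made-up stand-in for a fragment cache: build on a miss, reuse on a hit.
class FragmentCache
  def initialize
    @store = {}
  end

  # Return the cached fragment, or run the block once to build and store it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

expensive_calls = 0
get_links = lambda do
  expensive_calls += 1 # stands in for the grouped, counted queries
  ["Music (12)", "Theatre (4)"]
end

cache = FragmentCache.new
render_rail = lambda do
  cache.fetch("rail_links") { get_links.call.join("<br />") }
end

first  = render_rail.call # cache miss: the proc runs
second = render_rail.call # cache hit: the proc is skipped

puts first
puts "expensive calls: #{expensive_calls}"
```

The controller still decides what the data pull looks like; the cache just decides whether it ever has to happen.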

So there it is. Try it out and marvel at the newfound speed of your application!
