Thursday, June 18, 2009

has_and_belongs_to_many or has_many :through?

Well, I had been quite confused about these two options until I read some Rails books. They are simply two different ways to set up many-to-many relationships in ActiveRecord.

has_and_belongs_to_many (habtm)
habtm is the older approach, available since the early versions of Rails. It links the associated models through an intermediate join table.

class CreateProgrammersProjects < ActiveRecord::Migration
   def self.up
      # habtm expects the join table name in alphabetical order: programmers_projects
      create_table :programmers_projects, :id => false do |t|
         t.column :project_id, :integer, :null => false
         t.column :programmer_id, :integer, :null => false
      end
   end
   def self.down
      drop_table :programmers_projects
   end
end
class Programmer < ActiveRecord::Base
   has_and_belongs_to_many :projects # foreign keys in the join table
end
class Project < ActiveRecord::Base
   has_and_belongs_to_many :programmers # foreign keys in the join table
end
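
One detail worth checking is the name of the join table. By default, habtm looks for a table named after the two associated tables joined in alphabetical order, unless a :join_table option is given. The plain-Ruby sketch below only mimics that convention. The helper name default_join_table is made up for illustration and is not part of Rails, and the real rule has a few edge cases this ignores.

```ruby
# Illustrative helper (not a Rails API): mimics the default habtm
# naming convention by sorting the two table names and joining them
# with an underscore.
def default_join_table(table_a, table_b)
  [table_a, table_b].sort.join("_")
end

puts default_join_table("projects", "programmers")
puts default_join_table("articles", "users")
```

For Programmer and Project this gives programmers_projects. A join table with any other name has to be declared on both models with the :join_table option.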

Note that the join table does not need an id primary key, and there is no join model, only a join table. This becomes a problem as soon as we want extra columns on the join table. Back in Rails 1.2, we would have used push_with_attributes for this. However, push_with_attributes has been deprecated in favor of a far more powerful technique, where regular Active Record models are used as join tables (remember that with habtm, the join table is not an Active Record object).

To conclude, habtm is a simple way to do a many-to-many relationship using a join table when the join table doesn't have extra columns. You will need to upgrade the relationship to use has_many :through once you need to add additional columns.

has_many :through
Records in a habtm join table have no independent existence. However, we soon find that the join table has a life of its own and deserves a model once we add extra columns to it. Let's look at the relationship between articles and users, where the join model is a reading.

When a user reads an article, we can record the fact.

class Article < ActiveRecord::Base
   has_many :readings
end
class User < ActiveRecord::Base
   has_many :readings
end
class Reading < ActiveRecord::Base
   belongs_to :article
   belongs_to :user
end

reading = Reading.new
reading.rating = params[:rating]
reading.read_at = Time.now
reading.article = current_article
reading.user = session[:user]
reading.save


Here we have lost what habtm gave us: we can no longer ask a user which articles they have read, or an article which users have read it. To solve this, use the :through option of has_many.

class Article < ActiveRecord::Base
   has_many :readings
   has_many :users, :through => :readings
end
class Reading < ActiveRecord::Base
   belongs_to :article
   belongs_to :user
end
class User < ActiveRecord::Base
   has_many :readings
   has_many :articles, :through => :readings
end

Now you can query in both directions:

readers = an_article.users
articles = a_reader.articles
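
To make the idea concrete, here is a plain-Ruby sketch (no ActiveRecord involved) of what going "through" the join model means: each reading links one user to one article, and both directions are answered by scanning the readings. The Struct definitions, the sample data, and the articles_read_by and readers_of helpers are all made up for this illustration. ActiveRecord itself answers these queries in the database with a SQL join rather than in Ruby.

```ruby
# Plain-Ruby stand-ins for the three models (illustration only).
User    = Struct.new(:name)
Article = Struct.new(:title)
Reading = Struct.new(:user, :article, :rating)

alice = User.new("alice")
bob   = User.new("bob")
rails = Article.new("Rails associations")
ruby  = Article.new("Ruby blocks")

# The join records carry the extra data (here, a rating).
READINGS = [
  Reading.new(alice, rails, 5),
  Reading.new(alice, ruby, 3),
  Reading.new(bob, rails, 4)
]

# Roughly what user.articles means: follow the user's readings to articles.
def articles_read_by(user)
  READINGS.select { |r| r.user == user }.map(&:article)
end

# Roughly what article.users means: follow the article's readings to users.
def readers_of(article)
  READINGS.select { |r| r.article == article }.map(&:user)
end

puts articles_read_by(alice).map(&:title).inspect
puts readers_of(rails).map(&:name).inspect
```

Because every link is a Reading object in its own right, extra data such as the rating has a natural home, which is exactly what a bare habtm join table lacks.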


Unlike a normal has_many, ActiveRecord won’t let us add an object to the has_many :through association if both ends of the relationship are unsaved records. The create method saves the record before adding it, so it does work as expected, provided the parent object isn’t unsaved itself. To set extra attributes on the join record, pass them to create:


user.readings.create(:read_at => Time.now,
         :rating => params[:rating],
         :article => Article.new)


Choosing which way to build a many-to-many relationship is not always simple. If you need to work with the relationship model as its own entity, use has_many :through. Use has_and_belongs_to_many when working with legacy schemas or when you never work directly with the relationship itself.

http://blog.hasmanythrough.com/2006/4/20/many-to-many-dance-off
http://blog.hasmanythrough.com/2006/4/17/join-models-not-proxy-collections

2 comments:

Joe said...

Thanks for the nice overview. I'm in the process of changing my join tables from HABTM to has_many :through so I can add additional fields. Would there be any benefit to add a primary key in the join table?

Thanks again

chamnap said...

Well, like I said in the post, if you use has_many :through, you need to create a model. A model usually needs a primary key to work as an individual entity. However, if you use HABTM, you don't need to create a model, so you can reach the join rows only through the relationships of the other models.

I would say the only benefit of adding a primary key to the join table is that each join row can then be handled as an entity of its own, independent of the other models.
