Adding indexes with Padrino migrations
I really like the Padrino web framework. Its simplicity and layer of just enough above Sinatra appeals to me over the burgeoning complexity of Rails of late.
As my applications have become more complex though, the lack of documentation and good examples have become an increasing issue. I'm intending to write a few blog posts on common, yet more complex things, you need to do in Padrino to use it as a Rails replacement. Hopefully it can increase uptake and help other people with issues to solve some common gaps.
My app was getting to the size where database indexing was becoming a necessity.
Could I throw those in using padrino migrations without resorting to raw SQL
statements in the migrations? I couldn't tell. Googling gave me no love, but
since I was using
ActiveRecord one would assume it was supported. I couldn't
tell from the source code, so after a quick trip to
#padrino on IRC and a nice
chat with @wikimatze (where we both shrugged), I
pg_dump and then started some cowboy database experimenting.
Short story: It works and is supported, but it's slightly different if you're coming from indexes in Rails migrations. Here's what you need to know.
A basic column index
So, let's dive right into an example. I have a Species table, chock full of a distinct 6,379 animals. In general, there are a lot of lookups via the name, which is a unique combination of the genus and species. So, the Green And Black Poison Frog, one of the famous curare frogs, is Dendrobates auratus uniquely in the database. And that's what I need to create an index on.
So, I started with the following:
🚀 ~ padrino gen migration AddNameIndexToSpecies
which gave me a nice blank migration file which just
self.down, which I modified with the
change_table. The important thing to note is that you can't use the Rails migration syntax of
add_index. That just throws an error. You need to have it look a little something like this:
class AddNameIndexToSpecies < ActiveRecord::Migration def self.up change_table :species do |t| t.index :name, :unique => true end end def self.down change_table :species do |t| t.remove_index :name end end end
I wanted to make sure the index was unique since, at least in postgres, this will also add in an additional constraint preventing anyone inserting a duplicate record on the name which, with nearly 6500 species, is a bit of a concern. Plus it also is a double check on the data validations in the models.
padrino rake ar:migrate actually gives you a nice B-tree index on the name column in the species table.
INFO - Migrating to AddNameIndexToSpecies (37) DEBUG - (0.2ms) BEGIN == AddNameIndexToSpecies: migrating ========================================= -- change_table(:species) DEBUG - (125.8ms) CREATE UNIQUE INDEX "index_species_on_name" ON "species -> 0.1285s == AddNameIndexToSpecies: migrated (0.1286s) ================================ DEBUG - (1.3ms) INSERT INTO "schema_migrations" ("version") VALUES ('37') DEBUG - (2.1ms) COMMIT DEBUG - (0.3ms) SELECT "schema_migrations"."version" FROM "schema_migrati
You end up getting this output and bringing up the database console with
psql you can get confirmation that there is actually a new index on the species column.
aark_dev-# \d species Table "public.species" Column | Type | ------------------------------+-----------------------------+----------------- id | integer | not null default name | character varying(255) | common_name | text | analog | text | has_phylogenetic_study | character varying(255) | order | character varying(255) | family | character varying(255) | genus | character varying(255) | species | character varying(255) | ex_situ_research_needs_notes | text | created_at | timestamp without time zone | not null updated_at | timestamp without time zone | not null Indexes: "species_pkey" PRIMARY KEY, btree (id) "index_species_on_name" UNIQUE, btree (name)
Done. The speedup in my app was actually noticeable as well as confirmed by the difference in executing a page load. When people are looking at the species table, the main way they use the filtering is by name search.
So, that's pretty straightforward if you're at all used to putting in indices from Rails. How about a compound index? One that ends up using the combination of fields to look things up. In my database, since I have both assessments and conservation programs, both of which are a unique combo organization and species (and start date technically), and quite often they're being looked up to determine if a species is in captivity or in need of help in establishing a "bring back from the brink" program in terms of them being critically endangered. Maybe that's too esoteric though. Let's pick one that everyone can deal with, the Accounts or Users table.
In this app, we've just gone with peoples' emails as their primary identifier. Since the app is also international, we've also allowed people to identify themselves with two fields,
last_names in order to avoid the various issues around the way people describe themselves outside of the English speaking world. So, let's use that. In particular, since we have a full_name function in the app which parses those two things together, an index on the both of them would probably be quite useful. How do you get an index on two columns? Again, erring on the side of safety, I issue the command line
🚀 ~ padrino gen migration AddIndexesToAccounts
and then edit the resulting migration file to look a little something like this:
class AddIndexesToAccounts < ActiveRecord::Migration def self.up change_table :accounts do |t| t.index :email, :unique => true t.index [:first_names, :last_names] end end def self.down change_table :accounts do |t| t.remove_index :email t.remove_index :first_names_and_last_names end end end
To get compound indexes, you simply run the target columns together as an array of symbols. Running the migration, you get the surprising
index_accounts_on_first_names_and_last_names which I have to admit, I didn't see coming, so I had to alter the remove portion of this file to deal with that.
So far though, this is only giving you the same type of index, the B-tree. It's an excellent general purpose index for most searching and comparison, but particularly databases like PostgresSQL have other types of indexes available to them which assist things like array searching and specific applications such as full text searching.
How can you use them from Padrino? Well, at time of writing, as near as I can tell from diving down into 0.11.4, the dependency ceiling on
ActiveRecord (I'm currently at 3.2.14) and below 4.0 means you can't just pass a type of index in the migration unfortunately.
However, as soon as 0.12 has support for ActiveRecord 4.0+ I believe you should be able to use the following to be able to generate either a GIST or GIN type index to help your fulltext search in postgres or similar use case.
class AddNameIndexToSpecies < ActiveRecord::Migration def self.up change_table :species do |t| t.index :name, :unique => true, :using => :gin end end def self.down change_table :species do |t| t.remove_index :name end end end
We'll have to see if that bears out to be true on the 0.12 release.
Most likely though, you're probably fine with the B-tree index for most of what you want to do. Unless you know what you're doing, it's probably good to stick with that. Experiment at will though.
In any case, being able to add indexes from the migrations is a great, if undocumented, feature in Padrino so I'm hoping this closes the gap somewhat. I'll add similar notes to the migrations documentation as the padrino-core group pushes forward with revising documentation on the way to 0.12.