This was a talk I gave at WNYRB in 2012. It was basically a few anecdotes about connecting to legacy databases from a Rails application, which has been a common theme for me lately.
“Don’t Do It” Saturday, April 6, 13

Connecting to Microsoft SQL Server and other non-open-source RDBMSes. I found this statement very jarring until I actually had to connect to something other than a friendly database and schema. It highlights the Ruby-centric view that seems prevalent in the community. Most database engines can do a lot of cool stuff, yet far too many developers avoid them in favor of abstracting the direct interaction away. While I don't entirely disagree with that approach, I think it tends to skew solutions away from the database itself for the sake of database agnosticism or developer comfort. There are a lot of database features that can make a database more Rails friendly, and we'll talk about a few.
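For what it's worth, "doing it" is mostly a matter of wiring up an adapter. A sketch of what that configuration might look like, assuming the activerecord-sqlserver-adapter gem; all the host, database, and credential values here are made up for illustration:

```yaml
# Hypothetical database.yml entry for a legacy MSSQL database,
# assuming the activerecord-sqlserver-adapter gem is in the Gemfile.
legacy:
  adapter: sqlserver
  host: legacy-db.example.com
  port: 1433
  database: LegacyApp
  username: rails_reader
  password: <%= ENV["LEGACY_DB_PASSWORD"] %>
```

A read-only account like the one sketched above is a good default when you don't own the source database.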
...that existed at least 30 seconds before you decided to start your project:

• Inconsistent schema
• Poor naming conventions
• Non-Rails-friendly decisions
• Data source without a friendly Ruby wrapper
• Not Invented Here
• Invented here before I got here
• ...
• Data Patterns (ETL, Service API, SQL Views, Warehouse)
• Glue Tools (JRuby, MacRuby, Ruby extensions...)

Consider the scope of your application to be much larger than the stack you've chosen for the application itself. You don't need to use JRuby for the application itself in order to use a Java-based library to implement your data transfer pattern.
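The ETL pattern above can be sketched in a few lines of plain Ruby. Everything here — the export format, the legacy column names, the target — is invented for illustration; in real use the extract and load steps would talk to actual databases:

```ruby
require "csv"

# Minimal ETL sketch: extract rows from a legacy CSV export,
# transform the awkward legacy column names into friendly ones,
# and load the result into an array standing in for the target DB.
LEGACY_EXPORT = <<~DATA
  CUST_NM,CUST_PH_NBR
  Acme Corp,555-0100
  Globex,555-0199
DATA

def extract(io)
  CSV.parse(io, headers: true)
end

def transform(row)
  { name: row["CUST_NM"], phone: row["CUST_PH_NBR"] }
end

def load_into(target, record)
  target << record
end

target = []
extract(LEGACY_EXPORT).each { |row| load_into(target, transform(row)) }
```

Keeping the three steps as separate functions makes it easy to swap the extract side out later, e.g. for a JDBC source under JRuby.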
• Rules
• Triggers
• Materialized views (if available)

The point here is that you aren't stuck with what you have. You may not be able to change the current schema, but can you add to it and augment it? Can you get access to a read-only mirror? Can you add the tables you need, in a Rails-friendly way, into the same database as the badly formed data?
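One way to augment without touching the legacy tables is a view with Rails-friendly names layered on top. A sketch of what that migration might look like — the legacy table and column names here are made up, not from any real schema:

```ruby
# Sketch: a Rails migration that augments a legacy schema without
# changing it, by exposing a Rails-friendly view over a badly
# named legacy table. All identifiers are illustrative.
class AddFriendlyCustomersView < ActiveRecord::Migration
  def up
    execute <<-SQL
      CREATE VIEW customers AS
      SELECT CUST_ID AS id,
             CUST_NM AS name,
             CUST_PH AS phone
      FROM   TBL_CUST_MSTR;
    SQL
  end

  def down
    execute "DROP VIEW customers;"
  end
end
```

An ActiveRecord model pointed at `customers` then behaves (for reads, at least) like any other Rails model.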
...this in MSSQL and PG, and possibly others.

But aren't views very slow? You can index a view in both MSSQL and Postgres. You can also use tools like thinking_sphinx on the models if you are using ActiveRecord.
(25+)

• Databases are legacy in age and schema
• Maintain a significant amount of 'local' information

25+ different MSSQL databases that grew via copy-pasta development. Each schema was similar enough to be recognizable as almost the same app, but inconsistent enough to be impossible to generalize into a single set of models/mappings.
...but represent similar data

• No control over source data
• Each DB has millions of records
• All data will be (ET)L'd into the same place
• The legacy schemas are very poor
• Must implement all current features before launch
• Database is not Rails friendly

"I want the new app to work exactly like the old app, but with features a, b, ... z. We can't switch over until you are done, and the switch had better be seamless."

The first instinct here is to be happy that you get to start from scratch: your schema can be clean and free of cruft. The reality is that once you finish the application you still need to write a single-use ETL to transfer the data from the old database to the new one. The client, boss, or end user is going to expect every feature they used to have to be available in the new application, and you are looking at a potentially very painful switchover process.
...the things we have discussed

• Start with features they don't have, to make something useful right away
• Try to avoid a forced waterfall approach

By avoiding the switchover you can stick with a more iterative development process.
...in a 'Toy' environment (see Access)

• No good wrapper exists for the data source (Alpha 5, Access, Goldmine)
• Xerox ++ is also in effect
...log files with a set format

• Display results as close to real time as possible
• Maintain results in an AR-friendly schema to be consumed by a Rails app
• Process potentially millions of log lines per

In this case we don't even have a nice data source, just a bunch of text files in several data centers around the world. Fortunately, they were in a fairly static format.
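Parsing a set-format log line is the easy part. A sketch, with an invented line format (timestamp, level, message) since the real one isn't shown here:

```ruby
# Sketch of parsing one fixed-format log line into a hash that maps
# cleanly onto an AR-friendly table. The format is illustrative.
LOG_LINE = /\A(?<time>\S+ \S+) \[(?<level>\w+)\] (?<message>.*)\z/

def parse_line(line)
  m = LOG_LINE.match(line) or return nil
  { time: m[:time], level: m[:level], message: m[:message] }
end

record = parse_line('2013-04-06 12:00:01 [ERROR] disk full on /dev/sda1')
```

Returning `nil` for non-matching lines lets the reader service skip garbage without blowing up mid-stream.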
• Lightweight log reader service for each log
• Service to post the logs to
• Triggers to split the log data as needed

In this case Wayne wrote the log reader services, so of course they were written in bash, but we can pretend it was Ruby for the sake of this talk.
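The "post the logs to a service" half can be sketched with Ruby's standard Net::HTTP. The endpoint URL and the JSON payload shape are assumptions, not the real protocol from the talk:

```ruby
require "net/http"
require "json"
require "uri"

# Sketch: build an HTTP POST carrying a batch of log lines, as a
# log reader service might ship them to the collector service.
# URL and payload shape are hypothetical.
def build_log_post(url, lines)
  uri = URI(url)
  req = Net::HTTP::Post.new(uri.path, "Content-Type" => "application/json")
  req.body = JSON.generate(lines: lines)
  req
end

req = build_log_post("http://logs.example.com/batches", ["line one", "line two"])
# In real use you would send it with something like:
#   Net::HTTP.start(URI(url).host, URI(url).port) { |http| http.request(req) }
```

Batching lines per request, rather than posting one line at a time, is what keeps "millions of log lines" from becoming millions of HTTP round trips.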