<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SAPessi &#187; SQL</title>
	<atom:link href="http://sapessi.com/tag/sql/feed/" rel="self" type="application/rss+xml" />
	<link>http://sapessi.com</link>
	<description>Perfection of means and confusion of aims...</description>
	<lastBuildDate>Wed, 10 Aug 2011 07:36:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Tracking database records history</title>
		<link>http://sapessi.com/2009/11/tracking-database-records-history/</link>
		<comments>http://sapessi.com/2009/11/tracking-database-records-history/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 12:40:14 +0000</pubDate>
		<dc:creator>Stefano Buliani</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[History]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://sapessi.com/?p=288</guid>
		<description><![CDATA[If you Google for this you&#8217;ll find that the easiest way to provide a full audit of everything that happened in your database is to create a duplicate of each table you need history and insert a copy of the record you are changing in there for every update. For example if you have a [...]
]]></description>
			<content:encoded><![CDATA[
<p>If you Google for this you&#8217;ll find that the easiest way to provide a full audit of everything that happened in your database is to create a duplicate of each table you need history for and, on every update, insert a copy of the record you are changing into it.</p>
<p>For example, if you have a table called client:</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="kw1">CREATE</span> <span class="kw1">TABLE</span> client <span class="br0">&#40;</span><br />
&nbsp; id INTEGER<span class="sy0">,</span><br />
&nbsp; firstname VARCHAR<span class="br0">&#40;</span>255<span class="br0">&#41;</span><span class="sy0">,</span><br />
&nbsp; lastname VARCHAR<span class="br0">&#40;</span>255<span class="br0">&#41;</span><span class="sy0">,</span><br />
&nbsp; datecreated TIMESTAMP<br />
<span class="br0">&#41;</span>;</div>
</div>
<p>You will also create a table called client_history</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="kw1">CREATE</span> <span class="kw1">TABLE</span> client_history <span class="br0">&#40;</span><br />
&nbsp; id INTEGER<span class="sy0">,</span><br />
&nbsp; firstname VARCHAR<span class="br0">&#40;</span><span class="nu0">255</span><span class="br0">&#41;</span><span class="sy0">,</span><br />
&nbsp; lastname VARCHAR<span class="br0">&#40;</span><span class="nu0">255</span><span class="br0">&#41;</span><span class="sy0">,</span><br />
&nbsp; datecreated TIMESTAMP<span class="sy0">,</span><br />
&nbsp; history_creation TIMESTAMP<span class="sy0">,</span><br />
&nbsp; history_user INTEGER <span class="co1">-- A foreign key to the user who updated this record if required</span><br />
<span class="br0">&#41;</span>;</div>
</div>
<p>You&#8217;ll then attach an update trigger to the client table that inserts the data into your history table. This solution gives you the ability to keep track of everything that happens in your database, when it happened and who-done-it.<br />
What this solution doesn&#8217;t deal with is schema changes. Also, querying the state of a record at a point in time to join it with other tables might be a bit tricky.</p>
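<p>A minimal sketch of such a trigger in PostgreSQL, assuming the client and client_history tables above. The names client_history_on_update and client_history_trg are made up for this example, and history_user is left NULL because the database alone doesn&#8217;t know who your application user is.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;">-- Hypothetical trigger function: copies the old version of the row into client_history<br />
CREATE OR REPLACE FUNCTION client_history_on_update() RETURNS TRIGGER AS $$<br />
BEGIN<br />
&nbsp; -- OLD holds the row as it was before the update<br />
&nbsp; INSERT INTO client_history (id, firstname, lastname, datecreated, history_creation, history_user)<br />
&nbsp; VALUES (OLD.id, OLD.firstname, OLD.lastname, OLD.datecreated, now(), NULL);<br />
&nbsp; RETURN NEW; -- let the update go ahead unchanged<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
-- Hypothetical trigger name; fires once for every updated row<br />
CREATE TRIGGER client_history_trg<br />
BEFORE UPDATE ON client<br />
FOR EACH ROW EXECUTE PROCEDURE client_history_on_update();</div>
</div>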
<p>Let&#8217;s start from the first point. <strong>Schema changes</strong>.<br />
The most obvious and simple solution is to put in place a proper process for all database schema changes. i.e. whoever touches the schema is also in charge of updating the history table structure and the trigger on the main table. Doable but a bit too prone to human error if you ask me.</p>
<p>What I generally do is use a small function/stored procedure to create the history on a table. This procedure is in charge of both creating the history table the first time it&#8217;s called and updating the structure if the main table has changed.<br />
In this example I&#8217;m going to use <a href="http://www.postgresql.org/" target="_blank">PostgreSQL</a>. First because it&#8217;s a database I&#8217;m familiar with and, more importantly, because being open-source you can just download it and try this yourself.</p>
<p>My example here is a bit verbose but it serves its purpose. There are many database-specific instructions you could use to extract a table structure or perform some of the simple tasks of this function. However, what I&#8217;m trying to demonstrate is the concept and not PostgreSQL trickery.<br />
Most relational databases store the schema information in internal tables (pg_attribute, pg_class and pg_namespace in this case) so that&#8217;s what we are going to use to read the structure of our original table.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="co1">-- This function expects as parameters the name of the table you want to</span><br />
<span class="co1">-- create history for and the unique identifier of the record.</span><br />
<span class="co1">-- In my case all tables have an id column. Alternatively you could also pass</span><br />
<span class="co1">-- the name of the unique id column as a parameter</span><br />
<span class="kw1">CREATE</span> <span class="kw1">OR</span> <span class="kw1">REPLACE</span> <span class="kw1">FUNCTION</span> create_history<span class="br0">&#40;</span>tablename VARCHAR<span class="br0">&#40;</span>255<span class="br0">&#41;</span><span class="sy0">,</span> recordid INTEGER<span class="br0">&#41;</span> RETURNS <span class="kw1">BOOLEAN</span> <span class="kw1">AS</span> $$<br />
DECLARE<br />
&nbsp; vhisttablename NAME; <span class="co1">-- history table name</span><br />
&nbsp; vhistfieldcount INTEGER; &nbsp;<span class="co1">-- number of fields in history table</span><br />
&nbsp; vtablefieldcount INTEGER; <span class="co1">-- number of fields in main table</span><br />
&nbsp; vtmprowcount INTEGER; <span class="co1">-- temporary variable to store query results</span><br />
&nbsp; vcurfield RECORD; <span class="co1">-- variable to loop over fields of main table to create history</span><br />
&nbsp; vhisttablesql TEXT; <span class="co1">-- sql to create history table</span><br />
&nbsp; vfieldlist TEXT; <span class="co1">-- list of fields in main table</span><br />
&nbsp; vhisttablefields TEXT; <span class="co1">-- list of fields in history table</span><br />
&nbsp; vhistinsertsql TEXT; <span class="co1">-- SQL for insert statement</span><br />
&nbsp; vtmptablename VARCHAR<span class="br0">&#40;</span>255<span class="br0">&#41;</span>; <span class="co1">-- temp variable to check if history exists</span><br />
BEGIN<br />
&nbsp; vhisttablename :<span class="sy0">=</span> tablename<span class="sy0">||</span><span class="st0">'_hist'</span>;</p>
<p>&nbsp; <span class="co1">-- Check if the history table exists, if not we need to create it</span><br />
&nbsp; <span class="kw1">SELECT</span> <span class="kw1">INTO</span> vtmptablename relname<br />
&nbsp; <span class="kw1">FROM</span> pg_class<br />
&nbsp; <span class="kw1">WHERE</span> relname <span class="sy0">=</span> vhisttablename;</p>
<p>&nbsp; vfieldlist :<span class="sy0">=</span> <span class="st0">''</span>;<br />
&nbsp; vhisttablefields :<span class="sy0">=</span> <span class="st0">''</span>;</p>
<p>&nbsp; <span class="co1">-- count the fields in the history/current table if history exists</span><br />
&nbsp; <span class="kw1">IF</span> vtmptablename <span class="kw1">IS</span> <span class="kw1">NOT</span> <span class="kw1">NULL</span><br />
&nbsp; THEN<br />
&nbsp; &nbsp; <span class="kw1">SELECT</span> <span class="kw1">INTO</span> vhistfieldcount count<span class="br0">&#40;</span><span class="sy0">*</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">FROM</span> pg_attribute <span class="kw1">AS</span> a<br />
&nbsp; &nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_class <span class="kw1">AS</span> c <span class="kw1">ON</span> <span class="br0">&#40;</span>c<span class="sy0">.</span>oid <span class="sy0">=</span> a<span class="sy0">.</span>attrelid<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_namespace <span class="kw1">AS</span> n <span class="kw1">ON</span> <span class="br0">&#40;</span>n<span class="sy0">.</span>oid <span class="sy0">=</span> c<span class="sy0">.</span>relnamespace<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">WHERE</span> a<span class="sy0">.</span>attnum <span class="sy0">&gt;</span> 0<br />
&nbsp; &nbsp; <span class="kw1">AND</span> c<span class="sy0">.</span>relname <span class="sy0">=</span> vhisttablename <span class="co1">-- history table name</span><br />
&nbsp; &nbsp; <span class="kw1">AND</span> <span class="kw1">NOT</span> a<span class="sy0">.</span>attisdropped<br />
&nbsp; &nbsp; <span class="kw1">AND</span> pg_table_is_visible<span class="br0">&#40;</span>c<span class="sy0">.</span>oid<span class="br0">&#41;</span>;</p>
<p>&nbsp; &nbsp; <span class="kw1">SELECT</span> <span class="kw1">INTO</span> vtablefieldcount count<span class="br0">&#40;</span><span class="sy0">*</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">FROM</span> pg_attribute <span class="kw1">AS</span> a<br />
&nbsp; &nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_class <span class="kw1">AS</span> c <span class="kw1">ON</span> <span class="br0">&#40;</span>c<span class="sy0">.</span>oid <span class="sy0">=</span> a<span class="sy0">.</span>attrelid<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_namespace <span class="kw1">AS</span> n <span class="kw1">ON</span> <span class="br0">&#40;</span>n<span class="sy0">.</span>oid <span class="sy0">=</span> c<span class="sy0">.</span>relnamespace<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">WHERE</span> a<span class="sy0">.</span>attnum <span class="sy0">&gt;</span> 0<br />
&nbsp; &nbsp; <span class="kw1">AND</span> c<span class="sy0">.</span>relname <span class="sy0">=</span> tablename <span class="co1">-- main table</span><br />
&nbsp; &nbsp; <span class="kw1">AND</span> <span class="kw1">NOT</span> a<span class="sy0">.</span>attisdropped<br />
&nbsp; &nbsp; <span class="kw1">AND</span> pg_table_is_visible<span class="br0">&#40;</span>c<span class="sy0">.</span>oid<span class="br0">&#41;</span>;<br />
&nbsp; END <span class="kw1">IF</span>;</p>
<p>&nbsp; <span class="co1">-- Get all the attributes and their type from the original table</span><br />
&nbsp; <span class="kw1">FOR</span> vcurfield <span class="kw1">IN</span><br />
&nbsp; <span class="kw1">SELECT</span> a<span class="sy0">.</span>attname <span class="kw1">AS</span> <span class="kw1">COLUMN</span><span class="sy0">,</span> format_type<span class="br0">&#40;</span>a<span class="sy0">.</span>atttypid<span class="sy0">,</span> a<span class="sy0">.</span>atttypmod<span class="br0">&#41;</span> <span class="kw1">AS</span> datatype<br />
&nbsp; <span class="kw1">FROM</span> pg_attribute <span class="kw1">AS</span> a<br />
&nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_class <span class="kw1">AS</span> c <span class="kw1">ON</span> <span class="br0">&#40;</span>c<span class="sy0">.</span>oid <span class="sy0">=</span> a<span class="sy0">.</span>attrelid<span class="br0">&#41;</span><br />
&nbsp; <span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_namespace <span class="kw1">AS</span> n <span class="kw1">ON</span> <span class="br0">&#40;</span>n<span class="sy0">.</span>oid <span class="sy0">=</span> c<span class="sy0">.</span>relnamespace<span class="br0">&#41;</span><br />
&nbsp; <span class="kw1">WHERE</span> a<span class="sy0">.</span>attnum <span class="sy0">&gt;</span> 0<br />
&nbsp; <span class="kw1">AND</span> c<span class="sy0">.</span>relname <span class="sy0">=</span> tablename<br />
&nbsp; <span class="kw1">AND</span> <span class="kw1">NOT</span> a<span class="sy0">.</span>attisdropped<br />
&nbsp; <span class="kw1">AND</span> pg_table_is_visible<span class="br0">&#40;</span>c<span class="sy0">.</span>oid<span class="br0">&#41;</span><br />
&nbsp; LOOP<br />
&nbsp; &nbsp; <span class="co1">-- populate lists of fields both for history creation and select from main table</span><br />
&nbsp; &nbsp; vhisttablefields :<span class="sy0">=</span> vhisttablefields<span class="sy0">||</span>vcurfield<span class="sy0">.</span><span class="kw1">COLUMN</span><span class="sy0">||</span><span class="st0">' '</span><span class="sy0">||</span>vcurfield<span class="sy0">.</span>datatype<span class="sy0">||</span><span class="st0">' NULL, '</span>;<br />
&nbsp; &nbsp; vfieldlist :<span class="sy0">=</span> vfieldlist<span class="sy0">||</span>vcurfield<span class="sy0">.</span><span class="kw1">COLUMN</span><span class="sy0">||</span><span class="st0">', '</span>;</p>
<p>&nbsp; &nbsp; <span class="co1">-- If the history table exists and the number of fields is different</span><br />
&nbsp; &nbsp; <span class="co1">-- from the main table (+3 as we add a timestamp, a user field and an history unique id)</span><br />
&nbsp; &nbsp; <span class="kw1">IF</span> vtmptablename <span class="kw1">IS</span> <span class="kw1">NOT</span> <span class="kw1">NULL</span> <span class="kw1">AND</span> vtablefieldcount<span class="sy0">+</span><span class="nu0">3</span> <span class="sy0">&lt;&gt;</span> vhistfieldcount<br />
&nbsp; &nbsp; THEN<br />
&nbsp; &nbsp; &nbsp; <span class="co1">-- make sure that this is the missing field in the history table</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;PERFORM a<span class="sy0">.</span>attname<br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">FROM</span> pg_attribute <span class="kw1">AS</span> a<br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_class <span class="kw1">AS</span> c <span class="kw1">ON</span> <span class="br0">&#40;</span>c<span class="sy0">.</span>oid <span class="sy0">=</span> a<span class="sy0">.</span>attrelid<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">INNER</span> <span class="kw1">JOIN</span> pg_namespace <span class="kw1">AS</span> n <span class="kw1">ON</span> <span class="br0">&#40;</span>n<span class="sy0">.</span>oid <span class="sy0">=</span> c<span class="sy0">.</span>relnamespace<span class="br0">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">WHERE</span> a<span class="sy0">.</span>attnum <span class="sy0">&gt;</span> 0<br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">AND</span> c<span class="sy0">.</span>relname <span class="sy0">=</span> vhisttablename<br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">AND</span> <span class="kw1">NOT</span> a<span class="sy0">.</span>attisdropped<br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">AND</span> a<span class="sy0">.</span>attname <span class="sy0">=</span> vcurfield<span class="sy0">.</span><span class="kw1">COLUMN</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">AND</span> pg_table_is_visible<span class="br0">&#40;</span>c<span class="sy0">.</span>oid<span class="br0">&#41;</span>;</p>
<p>&nbsp; &nbsp; &nbsp; &nbsp;GET DIAGNOSTICS vtmprowcount <span class="sy0">=</span> ROW_COUNT;</p>
<p>&nbsp; &nbsp; &nbsp; <span class="co1">-- If it is then generate the SQL and execute the alter table</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span class="kw1">IF</span> vtmprowcount <span class="sy0">=</span> <span class="nu0">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;THEN<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;vhisttablesql :<span class="sy0">=</span> <span class="st0">'ALTER TABLE '</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">' ADD COLUMN '</span><span class="sy0">||</span>vcurfield<span class="sy0">.</span><span class="kw1">COLUMN</span><span class="sy0">||</span><span class="st0">' '</span><span class="sy0">||</span>vcurfield<span class="sy0">.</span>datatype<span class="sy0">||</span><span class="st0">' NULL;'</span>;<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;EXECUTE vhisttablesql;<br />
&nbsp; &nbsp; &nbsp; &nbsp;END <span class="kw1">IF</span>;<br />
&nbsp; &nbsp; &nbsp;END <span class="kw1">IF</span>;<br />
&nbsp; END LOOP;</p>
<p>&nbsp; <span class="co1">-- The history table doesn't exist. Create it</span><br />
&nbsp; <span class="kw1">IF</span> vtmptablename <span class="kw1">IS</span> <span class="kw1">NULL</span> <span class="kw1">OR</span> vtmptablename <span class="sy0">=</span> <span class="st0">''</span><br />
&nbsp; THEN<br />
&nbsp; &nbsp; vhisttablesql :<span class="sy0">=</span> <span class="st0">'CREATE TABLE '</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">' ( histid SERIAL PRIMARY KEY, '</span>;</p>
<p>&nbsp; &nbsp; vhisttablesql :<span class="sy0">=</span> vhisttablesql<span class="sy0">||</span>vhisttablefields<span class="sy0">||</span><span class="st0">' history_creation TIMESTAMP NOT NULL DEFAULT now() );'</span>;</p>
<p>&nbsp; &nbsp; EXECUTE vhisttablesql;</p>
<p>&nbsp; &nbsp; <span class="co1">-- create indexes</span><br />
&nbsp; &nbsp; vhisttablesql :<span class="sy0">=</span> <span class="st0">' CREATE INDEX idx_'</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">'_1 ON '</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">' (id); '</span>;<br />
&nbsp; &nbsp; EXECUTE vhisttablesql;</p>
<p>&nbsp; &nbsp; vhisttablesql :<span class="sy0">=</span> <span class="st0">' CREATE INDEX idx_'</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">'_2 ON '</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">' (history_creation); '</span>;<br />
&nbsp; &nbsp; EXECUTE vhisttablesql;<br />
&nbsp; END <span class="kw1">IF</span>;</p>
<p>&nbsp; <span class="co1">-- Proceed with the history creation</span><br />
&nbsp; vhistinsertsql :<span class="sy0">=</span> <span class="st0">'INSERT INTO '</span><span class="sy0">||</span>vhisttablename<span class="sy0">||</span><span class="st0">' ('</span><span class="sy0">||</span>vfieldlist<span class="sy0">||</span><span class="st0">' history_creation) SELECT '</span><span class="sy0">||</span>vfieldlist<span class="sy0">||</span><span class="st0">' now() FROM '</span><span class="sy0">||</span>tablename<span class="sy0">||</span><span class="st0">' WHERE id='</span><span class="sy0">||</span>recordid<span class="sy0">||</span><span class="st0">';'</span>;</p>
<p>&nbsp; RAISE NOTICE <span class="st0">'Executing %'</span><span class="sy0">,</span> vhistinsertsql;<br />
&nbsp; EXECUTE vhistinsertsql;</p>
<p>&nbsp; GET DIAGNOSTICS vtmprowcount <span class="sy0">=</span> ROW_COUNT;<br />
&nbsp; <span class="kw1">IF</span> vtmprowcount <span class="sy0">&gt;</span> 0<br />
&nbsp; THEN<br />
&nbsp; &nbsp; <span class="kw1">RETURN</span> TRUE;<br />
&nbsp; ELSE<br />
&nbsp; &nbsp; <span class="kw1">RETURN</span> FALSE;<br />
&nbsp; END <span class="kw1">IF</span>;<br />
END;<br />
$$<br />
<span class="kw1">LANGUAGE</span> plpgsql;</div>
</div>
<p>You&#8217;ll notice this function only adds fields to the history table. That&#8217;s because even if we remove a field from the original table we still want to keep track of it in our history.</p>
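<p>One way to wire this function to a table is a small generic trigger, sketched below. The names track_history and client_track_history are made up for this example, and it assumes the table has an integer id column as described above. I haven&#8217;t tested it on every PostgreSQL version, so treat it as a starting point.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;">-- Hypothetical generic trigger function: works for any table with an integer id column<br />
CREATE OR REPLACE FUNCTION track_history() RETURNS TRIGGER AS $$<br />
BEGIN<br />
&nbsp; -- In a BEFORE UPDATE trigger the table still holds the old version of the row,<br />
&nbsp; -- so create_history copies the values as they were before this update<br />
&nbsp; PERFORM create_history(TG_TABLE_NAME::VARCHAR, OLD.id);<br />
&nbsp; RETURN NEW; -- let the update go ahead unchanged<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
<br />
-- Attach it to the client table (repeat for every table that needs history)<br />
CREATE TRIGGER client_track_history<br />
BEFORE UPDATE ON client<br />
FOR EACH ROW EXECUTE PROCEDURE track_history();</div>
</div>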
<p>Now on to <strong>querying the history to retrieve the state of a record at a specific date</strong>.</p>
<p>This is a bit tricky. At the moment, as far as I know, only Oracle has the ability to return a &#8220;virtual table&#8221; from a function. PostgreSQL can return a RECORD variable &#8211; which is great to use inside functions, but once outside it loses its structure and turns into a comma-separated list of values &#8211; or a %ROWTYPE you can define.<br />
This unfortunately means that you&#8217;d have to create a custom function for each history table, one that returns SETOF client in our case.</p>
<p>What we are trying to achieve is something like what this query does.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="kw1">SELECT</span> h<span class="sy0">.</span>id<span class="sy0">,</span> firstname<span class="sy0">,</span> lastname<span class="sy0">,</span> datecreated<span class="sy0">,</span> history_creation<br />
<span class="kw1">FROM</span> client_hist <span class="kw1">AS</span> h<br />
<span class="kw1">INNER</span> <span class="kw1">JOIN</span> <span class="br0">&#40;</span><br />
&nbsp; <span class="kw1">SELECT</span> min<span class="br0">&#40;</span>history_creation<span class="br0">&#41;</span> <span class="kw1">AS</span> mintstamp<span class="sy0">,</span> id <span class="kw1">AS</span> id<br />
&nbsp; <span class="kw1">FROM</span> client_hist<br />
&nbsp; <span class="kw1">WHERE</span> history_creation <span class="sy0">&gt;</span> THE<span class="sy0">-</span>TIME<span class="sy0">-</span>YOU<span class="sy0">-</span>NEED<br />
&nbsp; <span class="kw1">AND</span> id<span class="sy0">=</span> YOUR<span class="sy0">-</span>RECORD<span class="sy0">-</span>ID<br />
&nbsp; <span class="kw1">GROUP</span> <span class="kw1">BY</span> id<br />
<span class="br0">&#41;</span> <span class="kw1">AS</span> m <span class="kw1">ON</span> <span class="br0">&#40;</span>m<span class="sy0">.</span>id <span class="sy0">=</span> h<span class="sy0">.</span>id <span class="kw1">AND</span> m<span class="sy0">.</span>mintstamp <span class="sy0">=</span> h<span class="sy0">.</span>history_creation<span class="br0">&#41;</span>;</div>
</div>
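<p>Basically the next record in the history after the time specified: each history row holds the values a record had just before an update, so the first row written after that time shows the state the record was in at that time.</p>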
<p>At this point so far as I can see we have two options. Either create a function that returns a specific type of ROW, or write a more generic history function to return only the id of the history record we need and not the whole row (If you look at the history creation function we are putting a histid column in there).</p>
<p>You could use this function in your queries to get the history record id out like this.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="co1">-- This function returns the histid you need to look at in your history table</span><br />
<span class="kw1">CREATE</span> <span class="kw1">OR</span> <span class="kw1">REPLACE</span> <span class="kw1">FUNCTION</span> select_hist_id<span class="br0">&#40;</span>tablename VARCHAR<span class="br0">&#40;</span>255<span class="br0">&#41;</span><span class="sy0">,</span> recordid INTEGER<span class="sy0">,</span> tstamp TIMESTAMP<span class="br0">&#41;</span> RETURNS INTEGER <span class="kw1">AS</span> $$<br />
DECLARE<br />
&nbsp; curs REFCURSOR;<br />
&nbsp; vid INTEGER;<br />
BEGIN<br />
&nbsp; OPEN curs <span class="kw1">FOR</span> EXECUTE <span class="st0">'SELECT h.histid<br />
&nbsp; FROM '</span><span class="sy0">||</span>tablename<span class="sy0">||</span><span class="st0">'_hist AS h <br />
&nbsp; INNER JOIN (<br />
&nbsp; &nbsp; SELECT min(history_creation) AS mintstamp, id <br />
&nbsp; &nbsp; FROM '</span><span class="sy0">||</span>tablename<span class="sy0">||</span><span class="st0">'_hist<br />
&nbsp; &nbsp; WHERE history_creation &gt; '''</span><span class="sy0">||</span>tstamp<span class="sy0">||</span><span class="st0">'''<br />
&nbsp; &nbsp; AND id='</span><span class="sy0">||</span>recordid<span class="sy0">||</span><span class="st0">'<br />
&nbsp; &nbsp; GROUP BY id<br />
&nbsp; ) AS m ON (m.id = h.id AND m.mintstamp = h.history_creation);'</span>;</p>
<p>&nbsp; FETCH curs <span class="kw1">INTO</span> vid;</p>
<p>&nbsp; <span class="kw1">RETURN</span> vid;<br />
END;<br />
$$<br />
<span class="kw1">LANGUAGE</span> plpgsql;</p>
<p><span class="co1">-- At this point you can just get ids like this</span><br />
<span class="co1">-- records in client as of this time 2009-11-02 21:13:41.552601</span><br />
<span class="kw1">SELECT</span> select_hist_id<span class="br0">&#40;</span><span class="st0">'client'</span><span class="sy0">,</span> id<span class="sy0">,</span> <span class="st0">'2009-11-02 21:13:41.552601'</span>::TIMESTAMP<span class="br0">&#41;</span> <span class="kw1">FROM</span> client;</div>
</div>
<p>I&#8217;m sure you can figure out the rest.</p>
<p>Be careful: this is very slow on large data sets. If you are planning to work on millions of records, you should consider building a history lookup function for each table, either defining data types for your RECORD or using OUT variables.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;"><span class="kw1">CREATE</span> <span class="kw1">FUNCTION</span> foo<span class="br0">&#40;</span>recordid int<span class="sy0">,</span> firstname OUT VARCHAR<span class="br0">&#40;</span><span class="nu0">10</span><span class="br0">&#41;</span><span class="sy0">&#8230;</span><span class="br0">&#41;</span></div>
</div>
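<p>As a rough sketch of what a per-table lookup could look like, here is a plain SQL function for the client table. The name client_as_of is made up for this example, and it assumes client_hist was created by create_history, so its firstname, lastname and datecreated columns have the same types as in client. It returns no rows if the record has not been updated since the given time.</p>
<div class="codesnip-container" >
<div class="sql codesnip" style="font-family:monospace;">-- Hypothetical per-table lookup: $1 is the record id, $2 the point in time<br />
CREATE OR REPLACE FUNCTION client_as_of(INTEGER, TIMESTAMP) RETURNS SETOF client AS $$<br />
&nbsp; SELECT h.id, h.firstname, h.lastname, h.datecreated<br />
&nbsp; FROM client_hist AS h<br />
&nbsp; WHERE h.id = $1<br />
&nbsp; AND h.history_creation &gt; $2<br />
&nbsp; ORDER BY h.history_creation -- oldest history row after the requested time comes first<br />
&nbsp; LIMIT 1;<br />
$$ LANGUAGE sql STABLE;<br />
<br />
-- The result has the structure of client, so it can be selected from and joined like a table<br />
SELECT * FROM client_as_of(1, '2009-11-02 21:13:41.552601'::TIMESTAMP);</div>
</div>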
<!-- Social Bookmarks BEGIN -->
<div class="social_bookmark">
<div class="d">
<a onclick="window.open(this.href, '_blank', 'scrollbars=yes,menubar=no,height=600,width=750,resizable=yes,toolbar=no,location=no,status=no'); return false;" href="http://myweb2.search.yahoo.com/myresults/bookmarklet?u=http%3A%2F%2Fsapessi.com%2F2009%2F11%2Ftracking-database-records-history%2F&amp;t=Tracking+database+records+history" rel="nofollow" title="Add to&nbsp;Yahoo My Web"><img class="social_img" src="http://sapessi.com/wp-content/plugins/social-bookmarks/images/yahoo.png" title="Add to&nbsp;Yahoo My Web" alt="Add to&nbsp;Yahoo My Web" /></a>
<br />
</div>
</div>
<!-- Social Bookmarks END -->
]]></content:encoded>
			<wfw:commentRss>http://sapessi.com/2009/11/tracking-database-records-history/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Keeping on top of your data</title>
		<link>http://sapessi.com/2009/11/keeping-on-top-of-your-data/</link>
		<comments>http://sapessi.com/2009/11/keeping-on-top-of-your-data/#comments</comments>
		<pubDate>Sun, 01 Nov 2009 14:58:20 +0000</pubDate>
		<dc:creator>Stefano Buliani</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analyze]]></category>
		<category><![CDATA[Data]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Explain]]></category>
		<category><![CDATA[Optimisation]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[Query]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Vacuum]]></category>

		<guid isPermaLink="false">http://sapessi.com/?p=276</guid>
		<description><![CDATA[Not just because it&#8217;s vital for your business. For most production systems the speed bottleneck lies in accessing your data. Databases are excellent for storing all your data and keeping it organised. However, when it comes to getting it out quickly, especially if you have lots of it, they are not the sharpest of tools. That&#8217;s [...]]]></description>
			<content:encoded><![CDATA[
<p>Not just because it&#8217;s vital for your business.<br />
For most production systems the speed bottleneck lies in accessing your data. Databases are excellent for storing all your data and keeping it organised. However, when it comes to getting it out quickly, especially if you have lots of it, they are not the sharpest of tools.</p>
<p>That&#8217;s why I always try to press home the importance of keeping on top of your data.</p>
<p>Even though you think all your data is neatly organised in your perfectly structured database, it keeps changing shape, or rather the understanding your database has of your data keeps changing.<br />
For example, in <a href="http://www.postgresql.org" target="_blank">PostgreSQL</a> the &#8220;shape&#8221; of your data is stored in a system catalog called pg_statistic (readable through the pg_stats view). Data is collected and stored there by <a href="http://www.postgresql.org/docs/8.1/static/sql-analyze.html" target="_blank">ANALYZE</a>.</p>
<p>The query planner uses the data collected in pg_statistic to pick the most efficient way to run your queries. Unfortunately no system is perfect; even the best planner makes mistakes. You have to strike a balance between letting ANALYZE collect as much data as possible, to give your database a better understanding of your data, and keeping the statistics slim enough for planning to stay quick.</p>
<p>So, as much as you can trust machines, I suggest you try to keep on top of your data yourself.</p>
<p>There are a couple of very simple ways to do that.</p>
<p>First, keep an eye on your database logs, exactly like you do with your webserver logs. There are a few open-source applications that can help you do that, such as <a href="http://epqa.sourceforge.net/" target="_blank">Enterprise Postgres Query Analyser</a>.<br />
This will give you a basic understanding of which queries you will need to focus on.</p>
<p>Once you know which queries are the slowest, you should check their explain plans at regular intervals. In PostgreSQL I have a scheduled job that runs &#8220;EXPLAIN ANALYZE&#8221; every day on the heaviest queries in my system and compares the output with the previous day&#8217;s.</p>
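<p>A minimal sketch of that kind of daily check, under loud assumptions: it uses Python with the built-in sqlite3 module, and SQLite&#8217;s EXPLAIN QUERY PLAN stands in for PostgreSQL&#8217;s EXPLAIN ANALYZE. The trades table, the idx_trades_security index and the helper names are all made up for illustration. This is not the scheduled job described above, only the idea of storing a plan and flagging when it changes.</p>
<pre>
```python
import sqlite3


def query_plan(conn, sql):
    """Return the planner's chosen steps for `sql` as a list of strings."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row is the human-readable plan step.
    return [row[-1] for row in rows]


def plan_changed(previous, current):
    """True when the current plan differs from the stored one."""
    return previous != current


# Hypothetical table and data, only here to give the planner something to plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, security_id INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO trades (security_id, price) VALUES (?, ?)",
    [(i % 50, 100.0 + i) for i in range(1000)],
)

heavy_query = "SELECT price FROM trades WHERE security_id = 7"

# "Yesterday": no index on security_id, so the planner scans the table.
yesterday = query_plan(conn, heavy_query)

# Something changes overnight (here: a new index), and so does the plan.
conn.execute("CREATE INDEX idx_trades_security ON trades (security_id)")
today = query_plan(conn, heavy_query)

if plan_changed(yesterday, today):
    print("plan changed")
    print("  before:", yesterday)
    print("  after: ", today)
```
</pre>
<p>Against PostgreSQL the same comparison would need the timing figures stripped from the EXPLAIN ANALYZE output first, since they change on every run even when the plan does not.</p>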
<p>It&#8217;s a lot of work, and you are probably better off confronting these problems as they come up rather than wasting valuable development time creating the most optimised database ever.<br />
Setting up <a href="http://www.postgresql.org/docs/current/static/routine-vacuuming.html#AUTOVACUUM" target="_blank">auto-vacuuming</a> properly will keep you safe for a long while.</p>
]]></content:encoded>
			<wfw:commentRss>http://sapessi.com/2009/11/keeping-on-top-of-your-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recursive queries are evil</title>
		<link>http://sapessi.com/2009/10/recursive-queries-are-evil/</link>
		<comments>http://sapessi.com/2009/10/recursive-queries-are-evil/#comments</comments>
		<pubDate>Fri, 30 Oct 2009 14:17:44 +0000</pubDate>
		<dc:creator>Stefano Buliani</dc:creator>
				<category><![CDATA[Technology]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Interview]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[Query]]></category>
		<category><![CDATA[Recursive]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Structure]]></category>
		<category><![CDATA[Tree]]></category>

		<guid isPermaLink="false">http://sapessi.com/?p=247</guid>
		<description><![CDATA[Maybe that&#8217;s a bit too harsh. Maybe recursive queries are not evil; it&#8217;s just the people who use them. I spend quite a lot of time working with PostgreSQL users helping them optimise their queries. When I read that PostgreSQL 8.4 added support for recursive queries I knew that a whole new hellish chapter in [...]]]></description>
			<content:encoded><![CDATA[
<p>Maybe that&#8217;s a bit too harsh. Maybe recursive queries are not evil; it&#8217;s just the people who use them.</p>
<p><span style="background-color: #ffffff;"><a href="http://www.postgresql.org"><img class="alignleft size-full wp-image-250" title="PostgreSQL Logo" src="http://sapessi.com/wp-content/uploads/2009/10/postgresql.png" alt="PostgreSQL Logo" width="150" height="150" /></a>I spend quite a lot of time working with <a href="http://www.postgresql.org" target="_blank">PostgreSQL</a> users helping them optimise their queries. When I read that <a href="http://www.postgresql.org/docs/8.4/static/release-8-4-1.html" target="_blank">PostgreSQL 8.4 added support for recursive queries</a> I knew that a whole new hellish chapter in my life would begin.</span></p>
<p><span style="background-color: #ffffff;">First off, what are recursive queries? From the <a href="http://www.postgresql.org/docs/8.4/static/queries-with.html" target="_blank">PostgreSQL manual</a>:</span></p>
<blockquote><p>Recursive queries are typically used to deal with hierarchical or tree-structured data. A useful example is this query to find all the direct and indirect sub-parts of a product, given only a table that shows immediate inclusions:</p>
<p><span style="background-color: #ffffff;">WITH RECURSIVE included_parts(sub_part, part, quantity) AS (<br />
SELECT sub_part, part, quantity FROM parts WHERE part = 'our_product'<br />
UNION ALL<br />
SELECT p.sub_part, p.part, p.quantity<br />
FROM included_parts pr, parts p<br />
WHERE p.part = pr.sub_part<br />
)<br />
SELECT sub_part, SUM(quantity) as total_quantity<br />
FROM included_parts<br />
GROUP BY sub_part</span></p></blockquote>
<p><span style="background-color: #ffffff;">These structures are commonly used in relational databases. Just think about a threaded comment system for a blog or an industry classification for securities on multiple levels (financial data is what I&#8217;m most familiar with).</span></p>
<p><span style="background-color: #ffffff;">In this latter case you can imagine that you&#8217;ll hardly ever extract industry classification information by itself. It&#8217;s generally used as a sub-query to provide additional information about a security, a trade or what have you.</span></p>
<p>As I said earlier I have nothing against recursive queries per se. However, I can already see people out there creating monster-queries in production systems. The sort of monster query that needs to be executed 50 times a second, the one that just doesn&#8217;t work.</p>
<p><span style="background-color: #ffffff;">Storing and retrieving tree-structured data in SQL is one of my favourite questions in interviews. I always make a point of asking it. Not because it&#8217;s particularly challenging technically but because it will tell me a lot about the way the person I&#8217;m interviewing thinks about data.</span></p>
<p><span style="background-color: #ffffff;">The first part of the question is, obviously, to design a structure to hold threaded blog comments.</span></p>
<p><span style="background-color: #ffffff;">Whether you use a separate table to hold the relationship between nodes or a self-referencing parent id column in the same table, I don&#8217;t really care. So long as you come up with an answer we can move on with the interview, because the answer to the next part of the question is what interests me.</span></p>
<p><span style="background-color: #ffffff;">I will now call your blog page with the ID from an element in your structure, any element. I want you to return instantly the ID of the root element for that branch of the tree.</span></p>
<pre>- Root comment 1
   - Child 1.1
   - Child 1.2
      - Child 1.2.1
   - Child 1.3
- Root comment 2
   - Child 2.1
      - Child 2.1.1
         - Child 2.1.1.1
   - Child 2.2</pre>
<p><span style="background-color: #ffffff;">I will call you with 2.1.1.1 and I want you to tell me 2, instantly. Feel free to change your database structure.</span></p>
<p><span style="background-color: #ffffff;">Their answer to this will tell me how they feel about de-normalisation and if they can think in those terms. We are talking about the daft requirements written by a product person who&#8217;s clearly gone quite mad. All he cares about is getting the data out quickly, nothing else.</span></p>
<p><span style="background-color: #ffffff;">The easiest de-normalised way out is to add a root id column to each comment row. It will make inserting new comments slower, but it won&#8217;t require any recursion to go back to the top when selecting data.</span></p>
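<p>A minimal sketch of the root id idea, assuming Python with the built-in sqlite3 module rather than PostgreSQL. The comments table, its column names and the add_comment and root_of helpers are hypothetical, chosen only to illustrate the trade-off.</p>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE comments (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES comments(id),  -- NULL for a root comment
        root_id   INTEGER,                          -- de-normalised: top of the thread
        body      TEXT NOT NULL
    )
""")


def add_comment(conn, body, parent_id=None):
    """Insert a comment and store the root of its thread alongside it."""
    cur = conn.execute(
        "INSERT INTO comments (parent_id, body) VALUES (?, ?)", (parent_id, body)
    )
    new_id = cur.lastrowid
    if parent_id is None:
        root_id = new_id  # a root comment is its own root
    else:
        # The extra work on insert: copy the root id from the parent row.
        root_id = conn.execute(
            "SELECT root_id FROM comments WHERE id = ?", (parent_id,)
        ).fetchone()[0]
    conn.execute("UPDATE comments SET root_id = ? WHERE id = ?", (root_id, new_id))
    return new_id


def root_of(conn, comment_id):
    """Single primary-key lookup, no recursion."""
    return conn.execute(
        "SELECT root_id FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()[0]


root_1 = add_comment(conn, "Root comment 1")
root_2 = add_comment(conn, "Root comment 2")
c_2_1 = add_comment(conn, "Child 2.1", root_2)
c_2_1_1 = add_comment(conn, "Child 2.1.1", c_2_1)
c_2_1_1_1 = add_comment(conn, "Child 2.1.1.1", c_2_1_1)

# Asking for the deepest comment returns the id of "Root comment 2".
print(root_of(conn, c_2_1_1_1) == root_2)
```
</pre>
<p>The cost lands on the insert, which needs one extra read of the parent row; the lookup stays a single primary-key read however deep the thread goes.</p>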
<p>If all you can come up with is a recursive query I&#8217;ll be sorely disappointed. It&#8217;s cool and elegant but not nearly efficient enough for a high-availability production system.</p>
<p><span style="background-color: #ffffff;">Feel free to talk about recursive queries when I ask you this question, just remember to put the magic words &#8220;materialized view&#8221; in front of them. Then we can talk.</span></p>
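<p>One way to read those magic words, sketched under assumptions: Python with the built-in sqlite3 module again, and because SQLite has no materialized views, a plain table refreshed on demand plays that role. The comment_roots table and the refresh_comment_roots and cached_root_of helpers are hypothetical names.</p>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, parent_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (id, parent_id, body) VALUES (?, ?, ?)",
    [
        (1, None, "Root comment 1"),
        (2, 1, "Child 1.1"),
        (3, None, "Root comment 2"),
        (4, 3, "Child 2.1"),
        (5, 4, "Child 2.1.1"),
        (6, 5, "Child 2.1.1.1"),
    ],
)


def refresh_comment_roots(conn):
    """Run the recursive query once and cache its result in a plain table."""
    conn.execute("DROP TABLE IF EXISTS comment_roots")
    conn.execute("""
        CREATE TABLE comment_roots AS
        WITH RECURSIVE thread(id, root_id) AS (
            -- anchor: every root comment is its own root
            SELECT id, id FROM comments WHERE parent_id IS NULL
            UNION ALL
            -- step: a child inherits the root of its parent
            SELECT c.id, t.root_id
            FROM comments c JOIN thread t ON c.parent_id = t.id
        )
        SELECT id, root_id FROM thread
    """)
    conn.execute("CREATE UNIQUE INDEX comment_roots_id ON comment_roots (id)")


def cached_root_of(conn, comment_id):
    """Read from the cache; the recursion is not repeated per request."""
    return conn.execute(
        "SELECT root_id FROM comment_roots WHERE id = ?", (comment_id,)
    ).fetchone()[0]


refresh_comment_roots(conn)
print(cached_root_of(conn, 6))  # id of "Root comment 2"
```
</pre>
<p>The cache goes stale whenever comments are added, so refresh_comment_roots has to be re-run on a schedule or after writes; that is the price of keeping the recursion out of the request path.</p>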
<p><span style="background-color: #ffffff;"><strong>Let this be a warning to you. If I find a non-materialized/cached recursive query in your production code I will recursively kick you in the head.</strong></span></p>
]]></content:encoded>
			<wfw:commentRss>http://sapessi.com/2009/10/recursive-queries-are-evil/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

