Joining a MySQL table against itself (self-joining MySQL tables)So I had an interesting MySQL problem this weekend -- I have a table: mysql> desc ldaplog; +------------+------------------+------+-----+---------------------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+------------------+------+-----+---------------------+----------------+ | ldaplog_id | int(10) unsigned | | MUL | NULL | auto_increment | | detected | datetime | | MUL | 0000-00-00 00:00:00 | | | uid | varchar(255) | | MUL | | | | event | varchar(255) | | MUL | | | | attr | varchar(255) | YES | MUL | NULL | | | value | text | YES | | NULL | | +------------+------------------+------+-----+---------------------+----------------+ It's a log table: every time a change gets detected in LDAP, a row is generated. 'Event' is one of: mysql> select count(*), event from ldaplog group by event; +----------+--------------+ | count(*) | event | +----------+--------------+ | 3389 | ATTR_ADDED | | 1 | ATTR_DELETED | | 20 | ATTR_UPDATED | | 357 | UID_ADDED | | 25 | UID_DELETED | +----------+--------------+ And your typical set of rows associated with a single UID looks lile: mysql> select * from ldaplog where uid = 'mccreavy'; +------------+---------------------+-----------+------------+-----------------+-----------------------+ | ldaplog_id | detected | uid | event | attr | value | +------------+---------------------+-----------+------------+-----------------+-----------------------+ | 2181 | 2008-09-28 00:00:00 | mccreavy | UID_ADDED | NULL | NULL | | 2191 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | uid | mccreavy | | 2183 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | givenname | Michael | | 2183 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | lastname | McCreavy | | 2184 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | l | Santa Clara, CA | | 2185 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | mail | mike@mccreavy.com | | 2186 | 2008-09-28 00:00:00 | mccreavy | ATTR_ADDED | mobile | 650-274-6633 | +------------+---------------------+-----------+------------+-----------------+-----------------------+ If a UID gets deleted from LDAP, a row looking like this will get logged: mysql> select * from ldaplog where uid = 'heinhold' and event = 'UID_DELETED'; +------------+---------------------+----------+-------------+------+-------+ | ldaplog_id | detected | uid | event | attr | value | +------------+---------------------+----------+-------------+------+-------+ | 4025 | 2008-12-02 00:00:00 | heinhold | UID_DELETED | NULL | NULL | +------------+---------------------+----------+-------------+------+-------+ And I was wondering if I could find out two things:
I guess the first one would be kind of easy with a subselect to just get a list of deleted UIDs: select * from ldaplog where uid in (select uid from ldaplog where event = 'UID_DELETED');but that wouldn't include the date of the UIDs deletion. So if I really needed the creation and deletion date of the UID in a single query, I'd have to get fancy with some self-joining. Self-joining is where you join a table against itself. I should explain it with some more simple examples, but I'll just move on instead. First, the base query: get the "UID Addred" rows: select A.uid, A.event, A.detected from ldaplog as A where A.event = 'UID_ADDED'; Second, join the table against itself: select A.uid, A.event, A.detected, B.event, B.detected from ldaplog as A left join ldaplog as B using (uid) where A.event = 'UID_ADDED' and B.event = 'UID_DELETED'; This is the result set I wanted. It just shows the added UIDs that have been deleted. If the table was logically inconsistent and there was a "UID_DELETED" row with no corresponding "UID_ADDED", that row would not be selected... This is because the base query started with a result set of UIDs that have been added. Now that I know the list of deleted UIDs, I can use that as a subselect to find the active UIDs:
select uid, detected from ldaplog
where event = 'UID_ADDED'
and uid not in
(select A.uid from ldaplog as A
left join ldaplog as B
using (uid)
where A.event = 'UID_ADDED' and B.event = 'UID_DELETED');
Or even just a count of active UIDs:
select count(*) from ldaplog
where event = 'UID_ADDED'
and uid not in
(select A.uid from ldaplog as A
left join ldaplog as B
using (uid)
where A.event = 'UID_ADDED' and B.event = 'UID_DELETED');
I bet there are other ways to get these result sets, but this is how I did it. |
|