Fixing a corrupt sqlite3 database with vbindiff

A few days ago my RSS reader, QuiteRSS, started crashing immediately on startup. After digging around a bit in ~/.local/share/QuiteRss/QuiteRss I discovered that a corrupted feeds.db file was the cause. This file is an sqlite3 database, which QuiteRSS uses to keep track of feeds, which posts the user has read, filters and so on. Obviously, if that file is broken then I'd have to go through the process of re-adding all 182 feeds I'm currently subscribed to. Annoying to say the least.

Comparing the corrupt feeds.db with the database from a fresh installation of QuiteRSS, using vbindiff, revealed the following:

feeds-corrupt.db                                                                
0000 0000: 01 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  ........ ........  
0000 0010: 01 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  ........ ........  
0000 0020: 01 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  ........ ........  
0000 0030: 01 00 00 00 00 00 00 00  00 00 00 01 00 00 00 00  ........ ........  
0000 0040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 0050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 27 36  ........ ......'6  
0000 0060: 00 2E 30 3A 0D 00 00 00  0C 00 D5 00 08 B5 05 E8  ..0:.... ........  
0000 0070: 05 AD 05 33 04 AB 04 0E  03 67 02 DC 02 61 01 B7  ...3.... .g...a..  
0000 0080: 01 32 00 D5 00 00 00 00  00 00 00 00 00 00 00 00  .2...... ........  
0000 0090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00C0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00D0: 00 00 00 00 00 5B 0C 07  17 15 15 01 81 19 74 61  .....[.. ......ta  
0000 00E0: 62 6C 65 69 6E 66 6F 69  6E 66 6F 0D 43 52 45 41  bleinfoi nfo.CREA  
0000 00F0: 54 45 20 54 41 42 4C 45  20 69 6E 66 6F 28 69 64  TE TABLE  info(id  
feeds-fresh.db                                                                  
0000 0000: 53 51 4C 69 74 65 20 66  6F 72 6D 61 74 20 33 00  SQLite f ormat 3.  
0000 0010: 10 00 01 01 00 40 20 20  00 00 00 11 00 00 00 0D  .....@   ........  
0000 0020: 00 00 00 00 00 00 00 00  00 00 00 0E 00 00 00 04  ........ ........  
0000 0030: 00 00 00 00 00 00 00 00  00 00 00 01 00 00 00 00  ........ ........  
0000 0040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 0050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 11  ........ ........  
0000 0060: 00 2E 30 3A 0D 00 00 00  0C 00 D5 00 08 B5 05 E8  ..0:.... ........  
0000 0070: 05 AD 05 33 04 AB 04 0E  03 67 02 DC 02 61 01 B7  ...3.... .g...a..  
0000 0080: 01 32 00 D5 00 00 00 00  00 00 00 00 00 00 00 00  .2...... ........  
0000 0090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00C0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00D0: 00 00 00 00 00 5B 0C 07  17 15 15 01 81 19 74 61  .....[.. ......ta  
0000 00E0: 62 6C 65 69 6E 66 6F 69  6E 66 6F 0D 43 52 45 41  bleinfoi nfo.CREA  
0000 00F0: 54 45 20 54 41 42 4C 45  20 69 6E 66 6F 28 69 64  TE TABLE  info(id  
┌──────────────────────────────────────────────────────────────────────────────┐
Arrow keys move  F find      RET next difference  ESC quit  T move top        
C ASCII/EBCDIC   E edit file   G goto position      Q quit  B move bottom     
└──────────────────────────────────────────────────────────────────────────────┘

The difference that stands out most is that the magic string "SQLite format 3" is missing from the corrupt file. Instead the first 56 bytes of the file appear to have been replaced with seven 64-bit little-endian integers with the value 1. Why? I have no idea. Perhaps some kind of heap corruption?

My first attempt at fixing this was to just replace the first 100 bytes of the file with data identical to the fresh file, which is quite easy to do with the E edit file command in vbindiff. This worked well enough that I was able to open the file in sqlitebrowser and see that the database schema and a few tables parsed OK:

Screenshot of sqlitebrowser

Sadly, this wasn't enough - the feeds table was empty.

After diggin around in the source code for sqlite3 for a while (mainly btreeInt.h, and lockBtree() in btree.c), I discovered that the reason for the feeds table not showing was because the nPage value in the header (bytes 28..31) was wrong. Its value specifies the size of the file in clusters, where each cluster is 4 KiB. So I took the size of the file (18415616 B), divided it by 4096 (4496), converted that to hexadecimal (0x1190) and put that hex value into the nPage part of the header:

feeds-almost.db                                                                 
0000 0000: 53 51 4C 69 74 65 20 66  6F 72 6D 61 74 20 33 00  SQLite f ormat 3.  
0000 0010: 10 00 01 01 00 40 20 20  00 00 00 11 00 00 00 0D  .....@   ........  
0000 0020: 00 00 00 00 00 00 00 00  00 00 00 0E 00 00 00 04  ........ ........  
0000 0030: 00 00 00 00 00 00 00 00  00 00 00 01 00 00 00 00  ........ ........  
0000 0040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 0050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 11  ........ ........  
0000 0060: 00 2E 30 3A 0D 00 00 00  0C 00 D5 00 08 B5 05 E8  ..0:.... ........  
0000 0070: 05 AD 05 33 04 AB 04 0E  03 67 02 DC 02 61 01 B7  ...3.... .g...a..  
0000 0080: 01 32 00 D5 00 00 00 00  00 00 00 00 00 00 00 00  .2...... ........  
0000 0090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
feeds-npage.db                                                                  
0000 0000: 53 51 4C 69 74 65 20 66  6F 72 6D 61 74 20 33 00  SQLite f ormat 3.  
0000 0010: 10 00 01 01 00 40 20 20  00 00 00 11 00 00 11 90  .....@   ........  
0000 0020: 00 00 00 00 00 00 00 00  00 00 00 0E 00 00 00 04  ........ ........  
0000 0030: 00 00 00 00 00 00 00 00  00 00 00 01 00 00 00 00  ........ ........  
0000 0040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 0050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 11  ........ ........  
0000 0060: 00 2E 30 3A 0D 00 00 00  0C 00 D5 00 08 B5 05 E8  ..0:.... ........  
0000 0070: 05 AD 05 33 04 AB 04 0E  03 67 02 DC 02 61 01 B7  ...3.... .g...a..  
0000 0080: 01 32 00 D5 00 00 00 00  00 00 00 00 00 00 00 00  .2...... ........  
0000 0090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
0000 00B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ........ ........  
┌──────────────────────────────────────────────────────────────────────────────┐
Arrow keys move  F find      RET next difference  ESC quit  T move top        
C ASCII/EBCDIC   E edit file   G goto position      Q quit  B move bottom     
└──────────────────────────────────────────────────────────────────────────────┘

And the result?

QuiteRSS working

It worked! 🕺

At this point I exported all my feeds as an OPML file, to have it a friendlier, interchangable XML form. The lesson from all this? I need to improve my backup system.. But on the plus side I learned quite a bit about sqlite3's internals!