Recoll
Recoll is a very fast and lightweight file indexer.
Overview
Favorite in TuxRadar.com's review:
- The index can be built manually.
- Additional filters can be installed to support more file types, and multiple indexes are supported.
Personal review:
- (+) Quite fast query results
- (+) Can index IMAP Maildir directly, even attachments!
- (+) Also returns files matching similar keywords (stemming).
- (-) Indexing hangs on some .zip files (with 5G+ memory, and a lot of swapping). Removed *.jar, *.zip, *.tgz, *.gz, *.tar from the list of files to index.
- (-) Low relevance ranking, aggravated by the point above. Files with the search keywords in the filename, in the title, or at the beginning of the file are not returned first. Stemming makes it even worse.
- Fix: According to the documentation, stemming can be disabled for a keyword by capitalizing it (floor will find flooring and floored, but Floor will only return floor). Stemming can also be disabled from the menu.
- (-) No way to narrow down the results to PDF only or DOC only (both are considered text).
- Fix: Make a new query and add ext:pdf to restrict the results to PDFs.
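The two fixes above (capitalizing a term to turn off stemming, and adding ext: to filter by file type) can be combined in one query string. The following is only a sketch: the helper names no_stem and ext_query are hypothetical and not part of Recoll, and the final step assumes the recoll binary accepts -t -q for command-line queries, as described in its manual.

```shell
#!/bin/sh
# Hypothetical helper: capitalize the first letter of a term.
# Recoll does not stem-expand terms that start with a capital letter.
no_stem() {
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s\n' "$first" "$rest"
}

# Hypothetical helper: restrict a query to one file extension.
# $1 = extension (pdf, doc, ...), remaining arguments = search terms.
ext_query() {
    ext=$1
    shift
    printf 'ext:%s %s\n' "$ext" "$*"
}

query=$(ext_query pdf "$(no_stem floor)")
echo "$query"

# Run the query only if recoll is installed (assumption: -t -q is available).
if command -v recoll >/dev/null 2>&1; then
    recoll -t -q "$query"
fi
```

The same string can be pasted into the search field of the graphical interface.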
Install
Ubuntu
The version in the Ubuntu universe repository is old. Install the backport repositories from launchpad.net as follows:
# This is not necessary anymore...
# gpg --keyserver keyserver.ubuntu.com --recv 9DA85604
# gpg --export --armor 9DA85604 | sudo apt-key add -
# gpg --keyserver keyserver.ubuntu.com --recv A0735AD0
# gpg --export --armor A0735AD0 | sudo apt-key add -
sudo add-apt-repository ppa:xapian-backports/ppa
# Old repository
# sudo add-apt-repository ppa:recoll-backports/ppa
# New repository
sudo add-apt-repository ppa:recoll-backports/recoll-1.15-on
sudo apt-get update
sudo apt-get install recoll
# Install recommended packages
sudo apt-get install antiword catdoc ghostscript libimage-exiftool-perl poppler-utils \
unrtf python-mutagen xsltproc untex pstotext python-chm
Debian
# Get the signing key (the sks-keyservers.net pool has since been retired; use another keyserver if this step fails)
gpg --keyserver hkp://pool.sks-keyservers.net:80 --recv-key 7808CE96D38B9201
gpg --export '7808CE96D38B9201' | sudo apt-key add -
# Add repository
cat << EOF | sudo tee /etc/apt/sources.list.d/recoll.list
deb http://www.lesbonscomptes.com/recoll/debian/ buster main
deb-src http://www.lesbonscomptes.com/recoll/debian/ buster main
EOF
# Update and install
sudo apt update
sudo apt install recoll
# Install recommended packages
sudo apt-get install antiword catdoc ghostscript libimage-exiftool-perl poppler-utils \
unrtf python-mutagen xsltproc untex pstotext python-chm
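Most of the recommended packages provide external helper commands that Recoll calls when indexing. A quick way to see which ones are still missing is sketched below. The function name missing_helpers is hypothetical, and the command names (gs for ghostscript, exiftool for libimage-exiftool-perl, pdftotext for poppler-utils) are the usual ones on Debian and Ubuntu but are stated here as assumptions. python-mutagen and python-chm are Python modules rather than commands, so they are not checked.

```shell
#!/bin/sh
# Hypothetical helper: print every command name given as an argument
# that cannot be found on PATH.
missing_helpers() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Assumed command names for the recommended packages.
missing=$(missing_helpers antiword catdoc gs exiftool pdftotext \
    unrtf xsltproc untex pstotext)

if [ -n "$missing" ]; then
    echo "Missing helper commands:"
    echo "$missing"
else
    echo "All helper commands found."
fi
```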
Usage
Some examples of queries (see [1]):
"foo bar"
foo bar ext:pdf
oracle filename:CRYPTO
oracle filename:*CRYPTO* # same as above
filename:photo size>1M
author:"john doe"
Beatles OR Lennon Live OR Unplugged -potatoes
dir:recoll dir:src -dir:utils -dir:common
dir:recoll OR dir:src
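Queries like these can also be run outside the graphical interface. The sketch below reads one query per line and passes each to recoll. It assumes the recoll binary supports -t -q for command-line queries, as described in its manual, and the function name run_queries is hypothetical. When recoll is not installed, only the query headers are printed.

```shell
#!/bin/sh
# Hypothetical helper: read queries from standard input, one per line,
# print a header for each, and run it if recoll is available.
run_queries() {
    while IFS= read -r q; do
        [ -n "$q" ] || continue          # skip empty lines
        printf '== %s\n' "$q"
        if command -v recoll >/dev/null 2>&1; then
            # </dev/null keeps recoll from consuming the query list.
            recoll -t -q "$q" </dev/null
        fi
    done
}

run_queries <<'EOF'
"foo bar"
oracle filename:CRYPTO
filename:photo size>1M
dir:recoll OR dir:src
EOF
```

The quoted here-document delimiter keeps the shell from interpreting characters such as > and * inside the queries.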