Hyper Estraier vs Xapian
Hyper Estraier and Xapian are two popular open source full-text search engines that support live index updates.
History
Xapian was built as an open source replacement for the Muscat search engine:
Xapian is partly derived from the Open Muscat engine, developed by BrightStation PLC and released under the GPL. Open Muscat was built to be a replacement for the proprietary Muscat 3.6 information retrieval system, which was written almost entirely in BCPL, and becoming hard to extend in the ways they wanted.
— The Xapian Project , History
Hyper Estraier was written and is maintained by Mikio Hirabayashi. It was created to be a successor to Estraier.
Portability
Both Hyper Estraier and Xapian are highly portable. Both can be installed on Linux, other Unix-like systems, Windows, and Mac OS X.
Code
Hyper Estraier is written in C and Xapian is written in C++.
Bindings
Both search engines provide bindings for various languages and RDBMSs.
Hyper Estraier provides native bindings as well as pure-language implementations for Perl, Java, and Ruby. Python is supported through SWIG.
Xapian's Perl bindings are available on CPAN as the module Search::Xapian. Java JNI bindings are included in the xapian-bindings module. Xapian also uses SWIG, which can generate bindings for many languages. At present, those for Python, PHP, Tcl, C#, and Ruby are working.
Features
| Feature | Hyper Estraier | Xapian |
|---|---|---|
| Unicode Support | Yes | Yes (including codepoints beyond the BMP) |
| Ranked Probabilistic Search | No | Yes |
| Relevance Feedback | No | Yes |
| Phrase and proximity searching | Yes | Yes |
| Boolean search operators | Yes | Yes |
| Attribute Search | Yes | No |
| Perfect recall | Yes | No |
| Stemming | No | Yes |
| Synonyms | Yes | Yes |
| Regular Expressions | Yes | No |
| Wildcards | Yes (through regular expressions) | Yes |
| Spelling Corrections | No | Yes |
| Live Index Updates | Yes | Yes |
| Reading of Various File Formats | No | Yes |
| P2P Architecture | Yes | No |
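
The "Perfect recall" row reflects a difference in indexing strategy: Hyper Estraier indexes text by character N-grams, so any substring of a document can be matched, whereas a word-based index only matches whole indexed terms unless stemming or wildcards are applied. The following is a minimal Python sketch of that contrast, not either engine's implementation. The names `NGramIndex` and `WordIndex` are hypothetical, and the word index deliberately omits stemming to keep the comparison simple.

```python
from collections import defaultdict


def char_ngrams(text, n=2):
    """Return the set of overlapping character n-grams in text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class NGramIndex:
    """Toy N-gram index (hypothetical): any substring of a document is findable."""

    def __init__(self, n=2):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> set of doc ids
        self.docs = {}                    # doc id -> lowercased text

    def add(self, doc_id, text):
        # Adding a document updates the postings immediately (a live update).
        self.docs[doc_id] = text.lower()
        for gram in char_ngrams(text, self.n):
            self.postings[gram].add(doc_id)

    def search(self, query):
        query = query.lower()
        grams = char_ngrams(query, self.n)
        if grams:
            # Candidates must contain every n-gram of the query.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            # Query shorter than n: fall back to checking every document.
            candidates = set(self.docs)
        # Verify that the n-grams occur contiguously, i.e. as a real substring.
        return sorted(d for d in candidates if query in self.docs[d])


class WordIndex:
    """Toy word index (hypothetical): exact whole-word matches, no stemming."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of doc ids

    def add(self, doc_id, text):
        for word in text.lower().split():
            self.postings[word].add(doc_id)

    def search(self, query):
        return sorted(self.postings.get(query.lower(), set()))


documents = {
    1: "full text search engine",
    2: "searching large indexes",
    3: "database indexing",
}

ngram = NGramIndex()
word = WordIndex()
for doc_id, text in documents.items():
    ngram.add(doc_id, text)
    word.add(doc_id, text)

print(ngram.search("search"))  # [1, 2]
print(word.search("search"))   # [1]
print(ngram.search("index"))   # [2, 3]
print(word.search("index"))    # []
```

In this sketch the word index misses "searching" and "indexing" for the queries "search" and "index"; a stemmer, as listed for Xapian in the table, is the usual way a word-based engine recovers such matches.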
License
Hyper Estraier is licensed under the LGPL, and Xapian is licensed under the GPL.
Performance
The Hyper Estraier project claims high-performance searching.