Bug #191

Search is slow when Lucene writes to disk

Added by Paula Gearon over 14 years ago.

Status: New
Priority: High
Assignee:
Category: Mulgara
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Resolution:

Description

Often, if a search is exceptionally slow, much of that time is wasted when Lucene decides that its results need to be sorted. If the quantity of data to be sorted does not fit into memory, then Lucene swaps it out to disk, radically slowing what would otherwise be a fast set of comparisons.

Most of this sorting is unnecessary. It can occur anywhere in the search process, e.g. when combining two datasets in the midst of a larger query, or when sorting the entire result set before returning it.

Tuning Lucene (telling it to sort only when necessary) and/or changing the OQL used in searches may vastly improve response times in some cases.
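
To illustrate what "sort only when necessary" could mean in practice, here is a minimal sketch. It is not Lucene or Mulgara code, and the class and method names (TopKSketch, topK) are hypothetical. It assumes the caller only needs the first page of results, so it keeps the k best scores in a bounded heap instead of sorting the whole result set. Memory use then stays proportional to k rather than to the number of hits, which is the situation in which a spill to disk would be avoided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only: not part of Lucene, Mulgara, or SearchAction.java.
class TopKSketch {

    /**
     * Returns the k largest scores in descending order.
     * At most k values are held in memory at any time, so the full
     * result set is never sorted (or buffered) as a whole.
     */
    static List<Double> topK(Iterable<Double> scores, int k) {
        List<Double> best = new ArrayList<>();
        if (k <= 0) {
            return best;
        }
        // Min-heap: the smallest of the current top k sits at the head.
        PriorityQueue<Double> heap = new PriorityQueue<>(k);
        for (double s : scores) {
            if (heap.size() < k) {
                heap.add(s);
            } else if (s > heap.peek()) {
                // New score beats the weakest kept score: replace it.
                heap.poll();
                heap.add(s);
            }
        }
        best.addAll(heap);
        // Only the k survivors are sorted, not the whole input.
        best.sort(Comparator.reverseOrder());
        return best;
    }

    public static void main(String[] args) {
        List<Double> scores = Arrays.asList(0.2, 0.9, 0.5, 0.7, 0.1);
        System.out.println(topK(scores, 2)); // prints [0.9, 0.7]
    }
}
```

Whether the same idea applies here depends on where the sort actually happens (inside a join in the middle of a query, or on the final result set), which this sketch does not address.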

Example: on a full corpus, enable date searching (currently disabled in SearchAction.java lines 155 to 164 due to excessive query slowness) and search PLoS One for:

gene AND date:[2008-05-16 TO 2009-01-26]
