Bug #8

Temp directory management

Added by brian - about 16 years ago. Updated almost 16 years ago.

Status: New
Priority: Normal
Assignee:
Category: Mulgara
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Resolution:

Description

A research group has reported seeing Kowari fill up temp directories and fall over. This was on Solaris, but it might be a more general problem to solve. Nothing to reproduce it yet, but I just wanted to capture the experience to potentially investigate this issue moving forward.

The usage pattern was one big load and then mostly queries with the occasional insert.
#1

Updated by Paula Gearon about 16 years ago

Queries create temporary files when constraint resolutions get too large to manage in memory.  The result of a query may be small, but the results of individual constraints can be quite large, particularly when a lot of data has been loaded.  So it may be these constraint resolution files at fault.

It would be worth testing whether these files are being removed in a timely manner.  I suggest adding an environment variable that lowers the high watermark for in-memory processing, and then resolving several constraints which exceed the lowered level.
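As a rough sketch of the suggested override, the limit could be read from the environment with a fallback to the built-in default. The class name, the variable name `MULGARA_TUPLES_HIGH_WATERMARK` and the default value are all hypothetical; none of them come from the existing code.

```java
import java.util.Map;

/** Hypothetical helper: resolves the in-memory row limit from an environment variable. */
final class WatermarkConfig {
    // Assumed names and values for illustration only; not taken from the code base.
    static final String ENV_NAME = "MULGARA_TUPLES_HIGH_WATERMARK";
    static final long DEFAULT_LIMIT = 1_000_000L;

    /** Returns the override if it is a positive number, otherwise the default. */
    static long resolve(Map<String, String> env) {
        String raw = env.get(ENV_NAME);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_LIMIT;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value > 0 ? value : DEFAULT_LIMIT;
        } catch (NumberFormatException e) {
            // A malformed override should not break query processing.
            return DEFAULT_LIMIT;
        }
    }

    /** Convenience overload that reads the real process environment. */
    static long resolve() {
        return resolve(System.getenv());
    }
}
```

A test run could then set the variable to something tiny (say 10) so that even small constraint resolutions spill to disk, which makes it much easier to watch the temp directory.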

The other thing to consider is how these files are being accessed.  If they are managed through normal I/O calls, then the close() on the file should delete it (if it was opened as a "temporary" file).  However, if they are being memory mapped, then we need to ensure that all references to the mapping are set to null.  We can even use a System.gc() loop if we really need to make sure the file has gone (but that should be done as a last resort).
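To make the "normal I/O" case concrete, here is a minimal check that a scratch file opened as delete-on-close really is gone after close(). It uses the java.nio.file API (Java 7 and later), which is an assumption on my part and may not match how the existing code opens its files; the class name and the file prefix are made up.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical check, not part of the existing code base. */
final class TempFileLifecycle {
    /** Writes to a delete-on-close scratch file and reports whether it is gone after close(). */
    static boolean deletedAfterClose() throws IOException {
        Path scratch = Files.createTempFile("tuples-", ".tmp");
        try (FileChannel channel = FileChannel.open(scratch,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE,
                StandardOpenOption.DELETE_ON_CLOSE)) {
            channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        }
        // The channel is closed here, so the file should no longer be listed.
        return Files.notExists(scratch);
    }
}
```

The memory mapped case is harder to check this way, because a mapping stays valid until its buffer is garbage collected, regardless of whether the channel has been closed.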
#2

Updated by brian - about 16 years ago

It wouldn't have been the case on Solaris, but I know that on Windows there are problems with temp files not being deleted until the VM exits, unless you go through a fair amount of nonsense. We should probably build some temp dir management tests into the suite (or extend any that are there) so we can easily catch these issues.
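One possible shape for such a test, as a sketch only: snapshot the directory listing, run the operation under test, and report any file names that appeared and were not cleaned up. The helper name is hypothetical, and it assumes the suite can point the temp directory at a location that nothing else writes to while the test runs.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical test helper: reports files an action leaves behind in a directory. */
final class TempDirLeakCheck {
    /** Returns the names present after the action that were not present before it. */
    static Set<String> leakedBy(File dir, Runnable action) {
        Set<String> before = names(dir);
        action.run();
        Set<String> after = names(dir);
        after.removeAll(before);
        return after;
    }

    private static Set<String> names(File dir) {
        String[] list = dir.list(); // null if dir does not exist or cannot be read
        return list == null
                ? new TreeSet<String>()
                : new TreeSet<String>(Arrays.asList(list));
    }
}
```

A suite test would wrap a query that is known to spill to disk in the action and assert that the returned set is empty.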
#3

Updated by Andrae Muys - almost 16 years ago

Some detailed bug reports would be useful, including the queries causing the trouble, and the count() attached to individual constraints.

There are some places where we might be performing distinct() or sort() more aggressively than we strictly need to, these being the operations that generate temporary files.  However, most of these calls are no-ops: once an intermediate result is sorted, subsequent calls to sort/distinct involving that data tend to do nothing.

To track this down, enable logging in HybridTuples and log both the file and its provenance for each instance (watch for clones; naturally they share the same file).
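For illustration, provenance logging could look roughly like the following. This is a stand-in class, not the real HybridTuples, and it uses java.util.logging only to stay self-contained; the actual class and logging framework may differ. Each instance captures a Throwable at construction so the log shows the call path that produced the file, and clones log the same file under a different event name.

```java
import java.io.File;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for a file-backed tuples object; not the real HybridTuples. */
final class TrackedTuples implements Cloneable {
    private static final Logger LOG = Logger.getLogger(TrackedTuples.class.getName());

    final File backingFile;
    final Throwable provenance; // stack trace of whoever created this instance

    TrackedTuples(File backingFile) {
        this(backingFile, "created");
    }

    private TrackedTuples(File backingFile, String event) {
        this.backingFile = backingFile;
        this.provenance = new Throwable(event + " " + backingFile);
        if (LOG.isLoggable(Level.FINE)) {
            // Logs the file name together with the creating call path.
            LOG.log(Level.FINE, provenance.getMessage(), provenance);
        }
    }

    /** A clone shares the backing file, so it is logged under a distinct event name. */
    @Override
    public TrackedTuples clone() {
        return new TrackedTuples(backingFile, "cloned");
    }
}
```

Matching the logged file names against whatever is left in the temp directory after a run should show which call paths produced the files that were never removed.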
