[LV2] Storing binary data chunks without duplication

Stefan Westerfeld stefan at space.twc.de
Sat May 11 09:39:18 PDT 2019


   Hi!

On Thu, May 09, 2019 at 06:52:17PM +0200, Robin Gareus wrote:
> I can't offer a solution, but perhaps a hint:
> 
> On 5/9/19 5:37 PM, Stefan Westerfeld wrote:
> > Ardour stores a new copy of each instrument chunk each time the user saves the session
> 
> When lv2_state_equal() matches the previously saved state, Ardour does
> not save the state.
> 
> The problem however is that the plugin is asked to save state
>   lilv_state_new_from_instance()
> this is compared: lv2_state_equal() and only on mismatch saved:
>   lilv_state_save().
> 
> I guess the problem in your case is that new_from_instance() already
> triggers saving the big external blob?

No, that is not really my problem. In the simplest case, my plugin state
consists of

 - general parameters (like number of unison voices or adsr params)
 - instrument chunk (samples + meta-data like loop points)

The general parameter block is small (a few kilobytes), the instrument
chunk can be large (a few megabytes).

General parameters change fairly often, whereas the instrument chunk is often
identical for each save. My problem is that the instrument chunk is duplicated
on disk if the user saves the session and at least one general parameter was
modified. So Ardour, the way it currently works, handles things correctly: it
detects that the state was changed, and correctly duplicates every part of the
state. But this means I cannot really store things like that, because it wastes
too much space.

As I understand you, that is the way state:makePath is supposed to work, so
you cannot re-use data already written in the last iteration for the next
plugin save.


So what would be a great way to do it, to avoid the problem? I think what would
be needed is an extension during save() that works somewhat like this:

  char *map_chunk (const char *path, size_t length, unsigned char *chunk);

Calling map_chunk for the plugin would be similar to

 - creating a path using state:makePath
 - populating the file with the length bytes contained in chunk
 - getting the abstract path of this file

Reading out the file would be done the same way as state:makePath.
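
To make the three steps concrete, here is a minimal sketch of such a
non-deduplicating map_chunk in C. Everything in it is an assumption for
illustration: HostPaths and the two callback types are hypothetical stand-ins
for the host callbacks behind state:makePath and state:mapPath (a real plugin
would use the LV2 State feature structs), the demo_* callbacks only exist so
the sketch can run on its own, and the host argument and the const qualifier
are additions to the prototype above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the host callbacks behind state:makePath and
 * state:mapPath. Both return malloc'd strings or NULL. */
typedef char *(*make_path_fn)(void *handle, const char *path);
typedef char *(*abstract_path_fn)(void *handle, const char *absolute_path);

typedef struct {
    void *handle;
    make_path_fn make_path;
    abstract_path_fn abstract_path;
} HostPaths;

/* Demo callback (illustration only): handle is the session directory.
 * Unlike a real host, it does not create parent directories. */
static char *demo_make_path(void *handle, const char *path)
{
    const char *dir = handle;
    size_t size = strlen(dir) + strlen(path) + 2;
    char *abs_path = malloc(size);
    if (abs_path)
        snprintf(abs_path, size, "%s/%s", dir, path);
    return abs_path;
}

/* Demo callback (illustration only): strips "<dir>/" from the path. */
static char *demo_abstract_path(void *handle, const char *absolute_path)
{
    const char *dir = handle;
    size_t dir_len = strlen(dir);
    if (strncmp(absolute_path, dir, dir_len) != 0 ||
        absolute_path[dir_len] != '/')
        return NULL;
    const char *rel = absolute_path + dir_len + 1;
    char *copy = malloc(strlen(rel) + 1);
    if (copy)
        strcpy(copy, rel);
    return copy;
}

/* Naive map_chunk: always writes a fresh file, so nothing is deduplicated.
 * Returns a malloc'd abstract path, or NULL on failure. */
char *map_chunk_naive(const HostPaths *host, const char *path,
                      size_t length, const unsigned char *chunk)
{
    assert(host && path && chunk);

    /* Step 1: create a path (state:makePath in a real plugin). */
    char *abs_path = host->make_path(host->handle, path);
    if (!abs_path)
        return NULL;

    /* Step 2: populate the file with length bytes from chunk. */
    FILE *f = fopen(abs_path, "wb");
    if (!f) {
        free(abs_path);
        return NULL;
    }
    size_t written = fwrite(chunk, 1, length, f);
    int close_failed = fclose(f);
    if (written != length || close_failed) {
        free(abs_path);
        return NULL;
    }

    /* Step 3: get the abstract path of the file just written. */
    char *abstract = host->abstract_path(host->handle, abs_path);
    free(abs_path);
    return abstract;
}
```

The plugin would then store the returned abstract path as part of its state,
exactly as it would with a file it wrote via state:makePath itself.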

So far, there is nothing special about it. However, the host would do this:

 - compute sha1 (or similar) hash of chunk/length
 - lookup in a session-global chunk directory if that hash was already stored
   if not: write chunk into session-global chunk directory
   make chunk file readonly
 - symlink path to the session-global directory entry

So if I store path="samples/foo.wav" the first time, the host stores it,
but if I store it the second time, or in another plugin instance, the host
can detect it, and only another symlink is stored.
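
A rough sketch of what the host side could look like, assuming a POSIX file
system. The function name store_chunk_dedup and its parameters are made up for
this example and are not part of any existing LV2 or Ardour API. FNV-1a is
used only as a dependency-free stand-in for sha1; it is not collision
resistant, so a real host would want a cryptographic hash here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* FNV-1a 64-bit hash: a simple stand-in for sha1 (assumption of this
 * sketch, NOT suitable where collisions must be ruled out). */
static uint64_t fnv1a64(const unsigned char *data, size_t length)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < length; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Store chunk at most once as store_dir/<hash> and create link_path as a
 * symlink to it. store_dir should be absolute so the link resolves from
 * anywhere. Returns 0 on success, -1 on error. */
int store_chunk_dedup(const char *store_dir, const char *link_path,
                      const unsigned char *chunk, size_t length)
{
    assert(store_dir && link_path && chunk);

    char target[4096];
    int n = snprintf(target, sizeof target, "%s/%016llx", store_dir,
                     (unsigned long long)fnv1a64(chunk, length));
    if (n < 0 || (size_t)n >= sizeof target)
        return -1;

    struct stat st;
    if (stat(target, &st) != 0) {
        if (errno != ENOENT)
            return -1;
        /* First time this content is seen: write it, then make it
         * read-only so later edits cannot change shared data. */
        FILE *f = fopen(target, "wb");
        if (!f)
            return -1;
        size_t written = fwrite(chunk, 1, length, f);
        if (fclose(f) != 0 || written != length) {
            unlink(target);
            return -1;
        }
        if (chmod(target, 0444) != 0)
            return -1;
    }

    /* Content already known (or just written): only a symlink is added. */
    return symlink(target, link_path);
}
```

Two saves of the same bytes under different paths then end up as two symlinks
to one file, so the second save costs a directory entry instead of another
copy of the chunk.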

As for cleanup: if a session-global directory entry is for any reason not
referenced by any symlink, it can be safely deleted by the host.
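
The cleanup could be sketched roughly like this, again assuming POSIX. The
name collect_unused_chunks is hypothetical, and the sketch assumes the host
can enumerate all symlinks it has created (here they are simply passed in as
an array); how a real host tracks them is left open.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if any of the given symlinks resolves to the file described by
 * chunk, 0 otherwise. Dangling links are ignored. */
static int is_referenced(const struct stat *chunk,
                         const char *const *links, size_t n_links)
{
    for (size_t i = 0; i < n_links; i++) {
        struct stat st;
        /* stat() follows the symlink, so st describes the link target. */
        if (stat(links[i], &st) == 0 &&
            st.st_dev == chunk->st_dev && st.st_ino == chunk->st_ino)
            return 1;
    }
    return 0;
}

/* Delete every regular file in chunk_dir that none of the symlinks in
 * links points to. Returns the number of deleted chunks, or -1 if the
 * directory cannot be opened. */
int collect_unused_chunks(const char *chunk_dir,
                          const char *const *links, size_t n_links)
{
    assert(chunk_dir && (links || n_links == 0));

    DIR *dir = opendir(chunk_dir);
    if (!dir)
        return -1;

    int removed = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", chunk_dir,
                         entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;

        struct stat st;
        if (lstat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue; /* skips ".", ".." and anything that is not a chunk */

        if (!is_referenced(&st, links, n_links) && unlink(path) == 0)
            removed++;
    }
    closedir(dir);
    return removed;
}
```

Comparing device and inode numbers instead of path strings keeps the check
independent of how the symlink targets were spelled. The important point is
that only the host has the complete list of links to pass in.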


As a last remark, if Ardour were to automatically dedup atom:Chunk properties
between all plugin saves, we would not even need an extension, I could simply
store the stuff I need to store as atom:Chunk properties.

> > session without duplicating data on each save?
> How about:
> 
>  - save the samples only once and treat them as external files that
> don't change (much like external sample-banks):  use map_path, not
> make_path.
> 
>  - save the meta-data as LV2_ATOM__Chunk. As long as it doesn't change,
> no new state will be saved.

Right. I had still hoped that there was some way around it, but it seems the way
to do it with existing LV2 is to store part of the plugin state in the session
and part of the plugin state outside the session. Also, even if map_chunk() were
added today, it would take years until it propagated to all users...

So I'm thinking of storing the instrument chunks in a global directory shared
by all plugin instances on the host, maybe mapping the paths to allow archival.
Since these are now shared between plugins, this means that edit operations in
one plugin instance affect other plugin instances in the same and in different
sessions. Also transferring your work from computer to computer is harder this
way.

And I cannot implement the sha1-based chunk repository myself, inside my
plugin, just for my plugin. It would almost be possible, but there is one part
that is missing: while Ardour, for instance, can detect that a chunk is no
longer needed because there are no longer any symlinks to it, in my plugin I
cannot see if this chunk is needed by any other instance in any other session,
so effectively unused chunks would be piling up forever.

So yes, there is a plan B, and I can implement it. But the problems of plan B
exist because, as of now, we don't really have any working way to store large
chunks of data in the session.

VST for instance allows this via VST chunks. I tested a few proprietary DAWs
and all of them allow VST chunk sizes of 500M or so. And Ardour even does the
right thing and does not duplicate the chunk over and over again. It's only LV2
in combination with Ardour that makes it impractical to work with large chunks
and the plugin state. Qtractor is somewhat better because it doesn't keep a
state history.

   Cu... Stefan
-- 
Stefan Westerfeld, http://space.twc.de/~stefan

