Search by Metadata in Digital Slide Archive

We’ve set up Digital Slide Archive on a cloud machine. All of the basics like importing and viewing data work fine. However, we are struggling with the search button.
I have an image that has a simple piece of metadata, Key: Foo, Value: Bar. I’m using the search box at the top of the DSA UI to find it, but I can’t. I’ve tried all three search options: Full Text, Prefix, and DICOM Metadata. Searching for either Foo or Bar returns no results.
The help text for the search box is not particularly helpful either, because it does not clarify which text is being searched through.

Can anyone confirm whether metadata search is meant to work in DSA?

The search field at the top doesn’t search metadata. You can configure some folders to show metadata in the list of items, and then metadata can be searched within that folder (see Girder Configuration Options — large_image documentation). There are API endpoints that can search metadata (GET item/query and GET folder/query).
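
As a rough sketch of what calling the item/query endpoint could look like, the snippet below builds the request URL for the Foo/Bar example from the original question. The host name is a placeholder, and the `query` and `limit` parameter names are assumptions on my part; check the API documentation page of your own instance before relying on them. The endpoint may also require an authenticated user.

```python
import json
from urllib.parse import urlencode


def build_item_query_url(api_root, mongo_query, limit=50):
    """Return a GET URL for the item/query endpoint.

    Assumption: the endpoint accepts the Mongo query as a JSON string in a
    `query` parameter and a page size in a `limit` parameter.
    """
    params = {
        "query": json.dumps(mongo_query, separators=(",", ":")),
        "limit": limit,
    }
    return f"{api_root.rstrip('/')}/item/query?{urlencode(params)}"


# Hypothetical host; metadata keys are stored under "meta" on each item.
url = build_item_query_url("https://dsa.example.org/api/v1", {"meta.Foo": "Bar"})
print(url)
```

The resulting URL can be fetched with any HTTP client, passing whatever authentication token your instance needs.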

The search box could be hooked up to search metadata, but it would probably need to specify what metadata should be searched.

– David

Thanks for the quick reply David! How could we hook up the search box to search metadata? Which configuration controls that? I did not find anything in the Girder configuration that sounded like a related setting.

For context: What we’re trying to do is hook up HistoQC with DSA. We want to run HistoQC on a dataset, upload the results to DSA as metadata, and then hopefully have an easy way of reviewing the slides that are good/bad.

Do you have any suggestions for how to achieve such a workflow with DSA?

What I think would be very helpful is a filter button at the top of a page, to filter by metadata fields and only display the images that match the filter criterion. If there is a way to do that with the present DSA capabilities, I’d appreciate a pointer.

Anything we do will take a small amount of code. Searching metadata is useful enough that we could just add this to the girder base in one of the plugins (e.g., in the HistomicsUI plugin). Probably the easiest thing is to expose a new search mode “metadata”, where you would expect the user to type <field> <expression> in the search (e.g., “rating good”).

From a database perspective, the problem is that the metadata can contain any keys and there isn’t an efficient way in Mongo to search arbitrary keys, but there is to search specific keys. If the user types a field and expression, then the internal query would be something like {"meta.rating": {"$regex": "good"}}. There is some complexity to make sure we match numeric fields as well as text, but that has been done elsewhere: large_image/itemList.js at 2bb83a49265341e56f8482e4237f6c155a01f2c7 · girder/large_image · GitHub.
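
To make the idea concrete, here is a minimal sketch of turning a typed "<field> <expression>" string into that kind of Mongo query. The function name `metadata_query` is hypothetical, and this is not the logic from the linked itemList.js. It only illustrates one simple way to cover numeric values: a `$regex` clause only matches string values, so an equality clause is added when the expression parses as a number.

```python
import math


def metadata_query(search_text):
    """Build a Mongo query dict from '<field> <expression>' (hypothetical helper)."""
    field, _, expression = search_text.strip().partition(" ")
    expression = expression.strip()
    if not field or not expression:
        raise ValueError("expected '<field> <expression>'")
    key = "meta." + field
    # Text match on one specific key, as in the example query above.
    text_clause = {key: {"$regex": expression}}
    try:
        number = float(expression)
    except ValueError:
        return text_clause
    if not math.isfinite(number):
        return text_clause
    # $regex does not match numeric values, so also test for equality.
    return {"$or": [text_clause, {key: number}]}


print(metadata_query("rating good"))
print(metadata_query("score 5"))
```

Because the query names a single key under `meta`, it avoids scanning arbitrary keys, which is the expensive case described above.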

I think from a coding perspective, we can register a new search mode and have it formulate the query used internally, which should be pretty straightforward (following the example in the dicom plugin, see girder/__init__.py at master · girder/girder · GitHub and girder/__init__.py at master · girder/girder · GitHub). If you want something more general than typing <field> <expression>, that requires more work in figuring out an efficient Mongo query. The DICOM example does a general search, but it really relies on not having many items in the database. I have ~10,000 items with metadata, and trying a similar query using a $where function takes minutes, whereas a regex on specific keys is fast.

For us, being able to search specific metadata keys would be good enough. Using your example above, if we can get a special search mode for metadata that recognizes “:”, that would probably cover most of our use cases. Even exact match “:” would cover a lot already. I don’t know much about numeric fields in the backend, so string matching would be a good starting point.

I made a PR that adds some search modes. See Add a metadata search mode for items and annotations by manthey · Pull Request #974 · girder/large_image · GitHub
