Query VFB via the Solr indexing system

Solr provides an extremely fast way to query all key content on VFB and is the backend used for most queries run on the website. It is pre-populated from the OWL ontology documents describing all the information on the site and their relationships. More details on Solr and its query syntax can be found at http://lucene.apache.org/solr/.

Usage

vfb_solr_query(
  query = "*:*",
  filterquery = NULL,
  fields = "label+short_form",
  sort = "score+desc",
  defaultfield = "short_form",
  rows = 30L,
  path = "solr/ontology/select?wt=json",
  server = getOption("vfbr.server.solr"),
  parse.json = TRUE,
  ...
)

Arguments

query: A character string of field:value terms specifying the query (the default "*:*" matches all documents)
filterquery: A character vector (of length one or more) describing filter queries for solr (see Details for regular vs filter queries)
fields: Which fields to return (+delimited). A value of "" implies all fields.
sort: Character vector naming one or more fields (+ delimited) to use for sorting the results.
defaultfield: Character vector naming default field used for filter queries (defaults to short_form)
rows: Maximum number of rows to return. The special value of Inf implies all matching rows.
path: The path on the server containing the query page
server: The base url of the server
parse.json: Whether or not to parse the response (default: TRUE)
...: additional solr query arguments
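The fields argument expects a single "+"-delimited string. As a small sketch in base R (an illustration, not part of the package's own examples), such a string can be assembled from a character vector of field names. The names used here are simply those in the default value:

```r
# Field names taken from the default value of `fields`
wanted <- c("label", "short_form")

# Collapse into the "+"-delimited form that the fields argument expects
fields <- paste(wanted, collapse = "+")
fields
```

The resulting string could then be passed as fields = fields in a call to vfb_solr_query.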

Value

When parse.json=TRUE, a data.frame containing the parsed response (originally the response$docs field in the parsed JSON) along with additional attributes including

numFound
start
responseHeader

When parse.json=FALSE, an httr::response object.
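The attributes attached to the parsed result can be read with attr(). The sketch below uses a hand-built stand-in data.frame rather than a live query, so every value in it is invented for illustration; only the attribute names numFound and start come from the description above.

```r
# Stand-in for a parsed result; all values are invented for illustration only
res <- data.frame(
  label = c("example_a", "example_b"),
  short_form = c("VFB_example1", "VFB_example2"),
  stringsAsFactors = FALSE
)
attr(res, "numFound") <- 5L  # total number of matches reported by the server
attr(res, "start") <- 0L     # offset of the first returned row

# TRUE when more documents matched than were actually returned,
# which suggests that a larger `rows` value may be needed
truncated <- attr(res, "numFound") > nrow(res)
truncated
```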

Details

The query argument maps onto the general Solr q= query, while filterquery maps onto one or more fq= terms. The Solr wiki says this about the difference:

The fq parameter defines a query that can be used to restrict the superset of documents that can be returned, without influencing score. It can be very useful for speeding up complex queries, since the queries specified with fq are cached independently of the main query. When a later query uses the same filter, there's a cache hit, and filter results are returned quickly from the cache.
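To make the mapping concrete, here is a rough sketch of how a main query and several filter queries might be laid out as Solr q= and fq= parameters. The helper sketch_solr_url and the server http://solr.example.org are hypothetical and are not part of vfbr. Passing defaultfield as Solr's df= parameter is likewise an assumption about how the package behaves, and no URL encoding is applied.

```r
# Hypothetical helper (not part of vfbr): shows one way the arguments
# could be laid out as Solr request parameters. No URL encoding is done.
sketch_solr_url <- function(query = "*:*", filterquery = NULL,
                            defaultfield = "short_form",
                            server = "http://solr.example.org",  # placeholder
                            path = "solr/ontology/select?wt=json") {
  # Each element of filterquery becomes its own fq= term
  fq <- if (length(filterquery)) {
    paste0("&fq=", filterquery, collapse = "")
  } else {
    ""
  }
  # Assumption: the default field is sent as Solr's df= parameter
  paste0(server, "/", path, "&q=", query, fq, "&df=", defaultfield)
}

u <- sketch_solr_url(query = "label:GMR_*",
                     filterquery = c("VFB_*", "label:*VNC*"))
u
```

Because each filter is sent as a separate fq= term, Solr can cache each one independently of the main query, as the quoted passage describes.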

Examples

# Find VFB ids matching a given GMR line
# note: in future, synonym_autosuggest will be the only field
# matching GMR* ids
vfb_solr_query(filterquery="VFB_*",query="synonym_autosuggest:GMR_10A07*")
#>                                         label   short_form
#> 1                JRC_R10A07_GAL4_VNC_20080306 VFB_00100chl
#> 2 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G2_40x VFB_001020qv
#> 3 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G1_40x VFB_001020qu
#> 4 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G4_40x VFB_001020qx
#> 5 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G3_40x VFB_001020qw
#> 6 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G6_40x VFB_001020qz
#> 7 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G5_40x VFB_001020qy

# Find VFB ids matching a given VT Gal4 line
vfb_solr_query(filterquery="VFB_*",query="label:VT017929*")
#> data frame with 0 columns and 0 rows

# how many GMR lines can we find
# note use of rows = 0 so we do not fetch results (but still get totals)
r=vfb_solr_query(filterquery="VFB_*",query="label:GMR_*", rows=0)
attr(r,'numFound')
#> [1] 0
# \donttest{
# VFB id for all GMR lines
all_gmr=vfb_solr_query(filterquery="VFB_*",query="label:GMR_*", rows=4000)
head(all_gmr)
#> data frame with 0 columns and 0 rows

# VFB id for all FlyCircuit neurons
# note use of rows=Inf to fetch all rows
all_fc=vfb_solr_query(filterquery="VFB_*",
  query="source_data_link_annotation:*flycircuit*", rows=Inf)
head(all_fc)
#> data frame with 0 columns and 0 rows
# }
