Solr provides an extremely fast way to query all key content on VFB and is the backend used for most queries run on the website. It is pre-populated from the OWL ontology documents describing all the information on the site and their relationships. More details on Solr and its query syntax can be found at http://lucene.apache.org/solr/.
Usage
vfb_solr_query(
query = "*:*",
filterquery = NULL,
fields = "label+short_form",
sort = "score+desc",
defaultfield = "short_form",
rows = 30L,
path = "solr/ontology/select?wt=json",
server = getOption("vfbr.server.solr"),
parse.json = TRUE,
...
)
Arguments
- query
A key-value list specifying the query
- filterquery
A character vector (of length one or more) describing filter queries for solr (see Details for regular vs filter queries)
- fields
Which fields to return (+ delimited). A value of "" implies all fields.
- sort
Character vector naming one or more fields (+ delimited) to use for sorting the results.
- defaultfield
Character vector naming the default field used for filter queries (defaults to short_form)
- rows
Maximum number of rows to return. The special value of Inf implies all matching rows.
- path
The path on the server containing the query page
- server
The base url of the server
- parse.json
Whether or not to parse the response (default: TRUE)
- ...
additional solr query arguments
Value
When parse.json=TRUE, a data.frame containing the parsed response (originally the response$docs field in the parsed JSON) along with additional attributes including numFound, start and responseHeader.
When parse.json=FALSE, an httr::response object.
Details
The query argument maps onto the general solr q= query while filterquery maps onto one or more fq= terms. The solr wiki says this about the difference:
The fq parameter defines a query that can be used to restrict the superset of documents that can be returned, without influencing score. It can be very useful for speeding up complex queries, since the queries specified with fq are cached independently of the main query. When a later query uses the same filter, there's a cache hit, and filter results are returned quickly from the cache.
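The mapping described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: build_solr_url is a hypothetical helper, not a function in this package, and http://solr.example.org is a placeholder server address. It does not model how vfb_solr_query itself applies defaultfield, fields, sort or rows.

```r
# Hypothetical helper (not part of the package): shows how a main query (q=)
# and one or more filter queries (fq=) could be combined into one select URL.
build_solr_url <- function(server, query = "*:*", filterquery = NULL,
                           path = "solr/ontology/select?wt=json") {
  # Percent-encode a single value so characters such as ':' and '*' are URL safe
  enc <- function(x) utils::URLencode(x, reserved = TRUE)

  # The main query becomes a single q= term
  params <- paste0("q=", enc(query))

  # Each element of filterquery becomes its own fq= term
  if (length(filterquery) > 0) {
    fq <- vapply(filterquery, enc, character(1), USE.NAMES = FALSE)
    params <- c(params, paste0("fq=", fq))
  }

  # The path already carries "?wt=json", so further terms are joined with "&"
  paste0(server, "/", path, "&", paste(params, collapse = "&"))
}

# Placeholder server; two filter queries give two separate fq= terms
url <- build_solr_url("http://solr.example.org",
                      query = "label:GMR_*",
                      filterquery = c("short_form:VFB_*", "label:*VNC*"))
print(url)
```

Because each filter query is sent as a separate fq= term, solr can cache each one independently of the main query, which is the behaviour the quoted passage describes.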
See also
Other query: vfb_neo4j_query(), vfb_owl_query()
Examples
# Find VFB ids matching a given GMR line
# note the field synonym_autosuggest will in future be the only one
# matching GMR* ids
vfb_solr_query(filterquery="VFB_*",query="synonym_autosuggest:GMR_10A07*")
#> label short_form
#> 1 JRC_R10A07_GAL4_VNC_20080306 VFB_00100chl
#> 2 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G2_40x VFB_001020qv
#> 3 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G1_40x VFB_001020qu
#> 4 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G4_40x VFB_001020qx
#> 5 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G3_40x VFB_001020qw
#> 6 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G6_40x VFB_001020qz
#> 7 JRC_R10A07-GAL4_MCFO_VNC_20181121_61_G5_40x VFB_001020qy
# Find VFB ids matching a given VT Gal4 line
vfb_solr_query(filterquery="VFB_*",query="label:VT017929*")
#> data frame with 0 columns and 0 rows
# how many GMR lines can we find
# note use of rows = 0 so we do not fetch results (but still get totals)
r=vfb_solr_query(filterquery="VFB_*",query="label:GMR_*", rows=0)
attr(r,'numFound')
#> [1] 0
# \donttest{
# VFB id for all GMR lines
all_gmr=vfb_solr_query(filterquery="VFB_*",query="label:GMR_*", rows=4000)
head(all_gmr)
#> data frame with 0 columns and 0 rows
# VFB id for all FlyCircuit neurons
# note use of rows=Inf to fetch all rows
all_fc=vfb_solr_query(filterquery="VFB_*",
query="source_data_link_annotation:*flycircuit*", rows=Inf)
head(all_fc)
#> data frame with 0 columns and 0 rows
# }