API Documentation

Notice: index-specific default values can be set in Pickmybrain's control panel. Default values remain in effect until overridden through Pickmybrain's API. (updated 2017-07-18)


Initialization
<?php
include "PMBApi.php";
$pickmybrain = new PickMyBrain();
?>
Searching

<?php
# First, set the index you want to query.
# (If you haven't set up your index, please read README.txt)
$pickmybrain->SetIndex("myindexname");

# If you want to define custom settings, please do it here between SetIndex and Search-methods.

# Search the index.
$result = $pickmybrain->Search("mykeyword");

$result['total_matches']; # number of found documents 
$result['matches']; # array of matches
$result['error']; # if an error occurred
$result['did_you_mean']; # keyword suggestions ( if enabled and present )

# going through matches:
# in database indexes $document_id is the primary key of your data table
foreach ( $result['matches'] as $document_id => $match )
{
    # Web-index specific fields ---------------->
    $match['title'];   # page title
    $match['URL'];     # page url
    $match['content']; # page content (max 5000 chars)
    $match['meta'];    # meta description ( if available )
    # ------------------------------------------>

    # document score
    $match['@score'];

    # if sorting, grouping or filtering is enabled
    # and you have manually defined an attribute named myAttribute
    $match['@myAttribute'];

    # original indexed data (if IncludeOriginalData is enabled)
    $match['mySQLcolumn'];

    # if you want to provide a focused version of your ( long )
    # indexed data field with highlighted keywords
    $long_field = "text123abc...";  // long text
    $query = "mykeyword";           // your search query
    $stem_language = "fi";          // language for stemming ( en | fi )
    $chars_per_line = 90;           // how many chars before forced linebreak ( <br> )
    $max_len = 150;                 // how many chars the focused result may contain in total
    $focused_field = $pickmybrain->SearchFocuser($long_field, $query, $chars_per_line, $max_len);
}
?>
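
The result fields listed above can be combined into a small result handler. The following is a minimal sketch, not part of the Pickmybrain API: the function name pmb_render_results and the hand-made $sample_result array are hypothetical, and the sketch assumes that $result['error'] is empty when the search succeeded.

```php
<?php
# Hypothetical helper: turns a Search() result array into printable lines.
# Assumes 'error' is empty on success and 'matches' is keyed by document id.
function pmb_render_results(array $result)
{
    if ( !empty($result['error']) )
    {
        return array("Search failed: " . $result['error']);
    }

    $lines = array("Found " . $result['total_matches'] . " documents");

    if ( !empty($result['did_you_mean']) )
    {
        $lines[] = "Did you mean: " . $result['did_you_mean'];
    }

    foreach ( $result['matches'] as $document_id => $match )
    {
        $lines[] = $document_id . " (score " . $match['@score'] . ")";
    }

    return $lines;
}

# Hand-made sample shaped like the result array described above
$sample_result = array(
    'total_matches' => 2,
    'matches'       => array(
        12 => array('@score' => 95),
        15 => array('@score' => 80)
    ),
    'error'         => '',
    'did_you_mean'  => ''
);

$rendered = pmb_render_results($sample_result);
echo implode("\n", $rendered), "\n";
```

In real use the array returned by Search() would be passed in place of $sample_result.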

To paginate results, pass the number of matches to skip as the second parameter of the Search() method. A third parameter sets how many results are returned. Both parameters must be whole numbers, given either as integers or as numeric strings.

<?php
# Skip the first 30 matches and return 15 matches after that.
$pickmybrain->Search("mykeyword", 30, 15);
?>
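
If your application works with page numbers rather than offsets, the skip value can be derived from them. The following is a minimal sketch; pmb_page_offset is a hypothetical helper name, not a Pickmybrain method, and pages are assumed to be numbered from 1.

```php
<?php
# Hypothetical helper: converts a 1-based page number into the offset
# that is passed as the second parameter of Search().
function pmb_page_offset($page, $per_page)
{
    $page = max(1, (int)$page); # treat anything below 1 as the first page
    return ($page - 1) * (int)$per_page;
}

$per_page = 15;
$offset   = pmb_page_offset(3, $per_page); # third page => skip 30 matches

# With an initialized $pickmybrain object the call would then be:
# $result = $pickmybrain->Search("mykeyword", $offset, $per_page);
```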

Keyword suggestions

If enabled, this feature can suggest better search terms. If you updated from a previous version (older than v1.04), reindexing is needed.

<?php
$result = $pickmybrain->Search("misstyped words");

if ( !empty($result['did_you_mean']) )
{
    # outputs 'mistyped words' if there is a clear difference in occurrence ( mistyped > misstyped )
    echo $result['did_you_mean'];
}
?>

Extended syntax

To exclude documents containing a certain keyword, provide that keyword with a hyphen (-) in front of it.

<?php
$pickmybrain->Search("wanted -unwanted");
?>

To find exact matches, wrap the keyword in double quotes, like "mykeyword".

<?php

# this query returns documents that have these exact keywords
$pickmybrain->Search('"exactkeyword1" "exactkeyword2" "exactkeyword3"');

# this query returns documents that have these exact keywords in this particular order
$pickmybrain->Search('"exactkeyword1 exactkeyword2 exactkeyword3"');

# ( ) operators allow prefix matches as well, but maintain the requirement for given keyword order
$pickmybrain->Search('(fuzzykeyword1 fuzzykeyword2 fuzzykeyword3)');

?>

Keyword stemming can be disabled for selected keywords with an asterisk ( mykeyword* ). This reduces the number of prefix/infix matches, which often results in fewer document matches.

<?php
$pickmybrain->Search('mykeyword*');
?>
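
When the query is assembled from separate form inputs, the operators above can be added programmatically. The following is a minimal sketch; pmb_build_query is a hypothetical helper, and it assumes that the exclusion and exact-match operators may be combined in a single query, which the text above does not state explicitly.

```php
<?php
# Hypothetical helper: assembles a query string from plain keywords,
# exact keywords ( "keyword" ) and excluded keywords ( -keyword ).
function pmb_build_query(array $wanted, array $unwanted = array(), array $exact = array())
{
    $parts = $wanted;

    foreach ( $exact as $keyword )
    {
        $parts[] = '"' . $keyword . '"';
    }

    foreach ( $unwanted as $keyword )
    {
        $parts[] = '-' . $keyword;
    }

    return implode(' ', $parts);
}

$query = pmb_build_query(array('wanted'), array('unwanted'), array('exactkeyword1'));
echo $query, "\n";

# With an initialized $pickmybrain object:
# $result = $pickmybrain->Search($query);
```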

Selecting matching mode

Matching mode controls which documents are considered as matches.

<?php

$pickmybrain->SetMatchingMode(PMB_MATCH_ALL);

?>

Available matching modes are:
  • PMB_MATCH_ANY ( at least one of the provided keywords must be found in the document )
  • PMB_MATCH_ALL ( all provided keywords must be found in the document, DEFAULT )
  • PMB_MATCH_STRICT ( PMB_MATCH_ALL + the first keyword must be the first token of some field )
Selecting ranking mode

Ranking modes take effect when results are sorted completely or partially by relevance.

<?php

$pickmybrain->SetRankingMode(PMB_RANK_PROXIMITY_BM25);

?>

Available ranking modes are:
  • PMB_RANK_PROXIMITY_BM25 ( phrase proximity and bm25 rankers, DEFAULT )
  • PMB_RANK_BM25 ( bm25 ranker only )
  • PMB_RANK_PROXIMITY ( phrase proximity ranker only )
Attributes

Attributes are document-specific values that can be used for sorting, grouping and filtering results.

Web-based indexes have three internal attributes: domain, timestamp and category.

  • domain is the crc32 checksum of the lowercase domain name, including the subdomain ( except www. )
  • timestamp is unix timestamp of the indexing time
  • category is the category id, if categories have been defined and document matches some predefined category
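
To filter by the domain attribute, the same checksum has to be computed on the application side. The following is a minimal sketch; pmb_domain_checksum is a hypothetical helper, and it assumes that PHP's built-in crc32() produces the same value as the one stored by the indexer.

```php
<?php
# Hypothetical helper: derives a domain attribute value from a URL.
# Assumption: PHP's crc32() yields the same checksum as the indexer.
function pmb_domain_checksum($url)
{
    $host = strtolower((string)parse_url($url, PHP_URL_HOST));
    $host = preg_replace('/^www\./', '', $host); # drop a leading www.
    return crc32($host);
}

$checksum = pmb_domain_checksum("https://WWW.Example.com/page.html");

# With an initialized $pickmybrain object, results could then be limited
# to that domain with the SetFilterBy method described under Filtering results:
# $pickmybrain->SetFilterBy("domain", $checksum);
```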

Both index types have three internal virtual attributes available for sorting results: @id, @count and @score.

  • @id orders results by document id, good and fast option when indexed data is naturally ordered in some manner
  • @count works only when group by is enabled, sorts result groups by their respective (sub) result counts
  • @score orders results by their respective scores ( ranking mode has effect )

Both index types have three internal virtual attributes available for sorting results inside result groups: @id, @sentiscore and @score.

  • @id orders results by document id, good and fast option when indexed data is naturally ordered in some manner
  • @sentiscore orders results by their respective sentiment analysis scores ( requires sentiment analysis )
  • @score orders results by their respective scores ( ranking mode has effect )

Database indexes do not have any internal attributes, but you are free to define your own attributes in the control panel.

Selecting sorting mode
<?php
$pickmybrain->SetSortMode(PMB_SORTBY_RELEVANCE);

# If sorting mode is set to PMB_SORTBY_ATTR, a second parameter must be provided.
# The second parameter must contain the sorting attribute and sorting direction ( asc/desc )
# Virtual attributes @score and @id can also be used.
$pickmybrain->SetSortMode(PMB_SORTBY_ATTR, "myAttribute DESC");
?>

Available sorting modes are:
  • PMB_SORTBY_RELEVANCE ( sort by descending document score, DEFAULT )
  • PMB_SORTBY_POSITIVITY ( the most positive matches first )
  • PMB_SORTBY_NEGATIVITY ( the most negative matches first )
  • PMB_SORTBY_ATTR ( sort by user provided predefined attribute )
Grouping results

Sometimes grouping results may be useful. For this to work, custom attributes must be defined for database indexes.

<?php

# If grouping mode is PMB_GROUPBY_ATTR, a second and a third parameter must be provided.
# The second parameter controls which attribute is used to form result groups.
# The third parameter controls how results are sorted within these groups.
$pickmybrain->SetGroupBy(PMB_GROUPBY_ATTR, "topicID", "@score DESC");
# The example above groups results by attribute topicID.
# The document with highest score is returned from each group.

# Grouping can be disabled with the following method:
$pickmybrain->ResetGroupBy();
?>

Available grouping modes are:
  • PMB_GROUPBY_DISABLED ( grouping is disabled, default )
  • PMB_GROUPBY_ATTR ( group by user provided predefined attribute )
Filtering results

Sometimes it may be necessary to limit searching to documents with certain attribute values.

<?php

# Filtering by singular values:
$pickmybrain->SetFilterBy("myAttribute", 1);
$pickmybrain->SetFilterBy("myAttribute", 2);
# The example above limits searching to documents with myAttribute value of 1 or 2.

# Besides singular values, it is also possible to define a value range.
$pickmybrain->SetFilterRange("myAttribute", 1, 10);
# The example above limits searching to documents with myAttribute value >= 1 and <= 10

# If you want to exclude documents with certain attribute values,
# please provide the attribute with an exclamation mark
$pickmybrain->SetFilterBy("!myAttribute", 2);

# All filters can be reset with the following method:
$pickmybrain->ResetFilters();
?>
Setting custom field weights

Choose whether to give certain fields ( like titles ) more weight during searching. Web-indexes have four pre-defined fields: title, content, url and meta. Database index fields are named exactly like the selected columns in the SQL query.

<?php
# For setting custom field weights, please populate an associative array in the following manner:
$field_weights = array("title" => 10, "meta" => 1);

# Then pass it as parameter to the SetFieldWeights-method.
$pickmybrain->SetFieldWeights($field_weights);
?>
Limit searching to certain fields

By default Pickmybrain searches all data fields in your search index. Fields can be excluded from the search by setting their field weights to zero.

<?php
# Exclude all fields except title:
$field_weights = array("title" => 1, "content" => 0, "url" => 0, "meta" => 0);

# Deploy the field settings:
$pickmybrain->SetFieldWeights($field_weights);
?>
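
When the set of searched fields changes per request, the weight array can be generated instead of written by hand. The following is a minimal sketch; pmb_only_fields is a hypothetical helper that gives every listed field a weight of 1 and all remaining fields a weight of zero.

```php
<?php
# Hypothetical helper: builds a weight array that keeps the listed fields
# at weight 1 and sets every other field to zero.
function pmb_only_fields(array $all_fields, array $searched_fields)
{
    $weights = array_fill_keys($all_fields, 0);

    foreach ( $searched_fields as $field )
    {
        if ( isset($weights[$field]) )
        {
            $weights[$field] = 1;
        }
    }

    return $weights;
}

# The four pre-defined web-index fields
$web_fields    = array("title", "content", "url", "meta");
$field_weights = pmb_only_fields($web_fields, array("title", "meta"));

# With an initialized $pickmybrain object:
# $pickmybrain->SetFieldWeights($field_weights);
```
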
Include original data

If you want to include original indexed data with your search results, you can do it by enabling the include original data setting.

<?php
# accepted values are:
# boolean true or integer 1 for enabled
# boolean false or integer 0 for disabled
$pickmybrain->IncludeOriginalData(true);
?>
Stem keywords

Stemming incoming keywords may be beneficial if the search index contains prefixes, as this will most certainly return more results.

<?php
# accepted values are:
# boolean true or integer 1 for enabled
# boolean false or integer 0 for disabled
$pickmybrain->KeywordStemming(true);
?>
Stemmed keyword minimum length

This setting controls how much shorter the stemmed version of a keyword is allowed to be compared to the original keyword.

<?php
# The stemmed version must be at least 70 percent in length 
# of the original keyword, otherwise it will be discarded
$pickmybrain->StemMinimumQuality(70);
?>
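
The length rule can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions: pmb_stem_is_acceptable is a hypothetical helper, the stems are supplied by hand rather than produced by Pickmybrain's stemmer, and lengths are counted in bytes.

```php
<?php
# Hypothetical helper: returns true if the stemmed keyword is long enough
# compared to the original keyword. Lengths are counted in bytes here.
function pmb_stem_is_acceptable($original, $stemmed, $minimum_percentage)
{
    if ( $original === '' )
    {
        return false;
    }

    $percentage = strlen($stemmed) / strlen($original) * 100;
    return $percentage >= $minimum_percentage;
}

# 'car' is 3/4 = 75 percent of 'cars', so it passes a limit of 70
var_dump(pmb_stem_is_acceptable("cars", "car", 70));

# 'run' is 3/7 of 'running', which is below 70 percent
var_dump(pmb_stem_is_acceptable("running", "run", 70));
```
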
Dialect matching

This feature strips dialect-specific characters ( such as ä and ö ) from user-provided keywords. A document matches if it contains either the original keyword or the processed keyword.

<?php
# Examples:
# (Disabled) INPUT: räikkönen OUTPUT: räikkönen
# (Enabled)  INPUT: räikkönen OUTPUT: räikkönen OR raikkonen
# accepted values are:
# boolean true or integer 1 for enabled
# boolean false or integer 0 for disabled
$pickmybrain->DialectMatching(true);
?>
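
The effect of the example above can be reproduced with a simple character replacement. The following is a minimal sketch; pmb_fold_dialect is a hypothetical helper and its character table is illustrative only, not Pickmybrain's own mapping. The source file is assumed to be saved as UTF-8.

```php
<?php
# Hypothetical helper: replaces a few accented letters with plain ones.
# The character table is illustrative only, not Pickmybrain's own mapping.
function pmb_fold_dialect($keyword)
{
    $table = array(
        "ä" => "a", "ö" => "o", "å" => "a",
        "Ä" => "A", "Ö" => "O", "Å" => "A"
    );
    return strtr($keyword, $table);
}

$keyword = "räikkönen";
$folded  = pmb_fold_dialect($keyword);

# Either form is accepted when dialect matching is enabled
$accepted = array_unique(array($keyword, $folded));
echo implode(" OR ", $accepted), "\n";
```
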
Match quality scoring

If the given search term matches a prefix, postfix or infix of another word, this option chooses whether such matches are scored equally with exact matches or lower. If this feature is disabled, each prefix match has a score of 1.

<?php
# accepted values are:
# boolean true or integer 1 for enabled
# boolean false or integer 0 for disabled
$pickmybrain->QualityScoring(true);
?>
Quality score lower limit

You might want to limit the number of lower quality prefix matches for the given keyword(s). This is possible with the QualityScoreLimit(int $percentage) method.

<?php
# the matched tokens must contain at least 50 percent of the original keyword
$pickmybrain->QualityScoreLimit(50);

# let's search for 'car'
$result = $pickmybrain->Search('car');
# this will match 'cars', but not 'carpool', because 
# the latter is only a 43 percent match against the original keyword
?>
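
The 43 percent figure in the example is consistent with dividing the keyword length by the length of the matched token ( 3 / 7 ). The following is a minimal sketch of that arithmetic; pmb_match_quality is a hypothetical helper, and the exact formula used internally by Pickmybrain is an assumption here.

```php
<?php
# Hypothetical helper: share of the matched token covered by the keyword,
# rounded to a whole percentage.
function pmb_match_quality($keyword, $token)
{
    if ( $token === '' )
    {
        return 0;
    }

    return (int)round(strlen($keyword) / strlen($token) * 100);
}

$limit = 50;

foreach ( array('cars', 'carpool') as $token )
{
    $quality = pmb_match_quality('car', $token);
    echo $token, ": ", $quality, " percent, ", ($quality >= $limit ? "kept" : "dropped"), "\n";
}
```
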
Prefix/Infix/Suffix expansion limit

Limits the number of prefixes, infixes and suffixes that the search term can match. Closest results come first. A larger value means more results but slower operation.

<?php
# method expects an integer value
# if the value is defined zero, prefix matching will be disabled
$expansion_limit = 32; # 32 best matches
$pickmybrain->ExpansionLimit($expansion_limit);
?>
Disable documents and exclude them from the resultset

If you have a big search index, re-indexing might be out of the question. Outdated results can be hidden with the following method:

<?php

$index_name = "myindex"; # your index name
$array_of_deprecated_ids = array(10, 12, 15); # array of document ids

# disable given document ids
$pickmybrain->DisableDocuments($index_name, $array_of_deprecated_ids);

# If you want to re-enable the disabled documents:
$pickmybrain->ResetDisabledDocuments($index_name);
?>