District Core Developer Docs

NlpApi Module

Natural Language Processing

This module runs natural language processing on user-provided text. The most common use case is processing the values of long text input elements (textarea).

The module abstracts a common API for processing and supports multiple processors.

Because NLP can be expensive in both compute and time, it should always run in a background job, and its results should always be cached.

Current

  • CoreNLP - self hosted and thus privacy focused.
  • ...todo

Future considerations

  • Google
  • MonkeyLearn
  • OneAi

How processors work (adding new processors)

See Config/nlp_api.php for configuration. The processor key defines the default processor and the processors key lists the available processors. Each processor should define a handler, which is the class responsible for processing the text.

All handlers should extend Modules\NlpApi\Processor\BaseNlpProcessor, with the process function returning an array with the following structure:

   // Process.
   $processor = new MyCustomProcessor();
   $output = $processor->setText('Some good text. Do some analysis. Some bad text')->process();

   // Expected $output
   [
     'sentiment' => 'Positive', // Most common sentiment from all sentences.
     'sentiment_split' => [     // Count of sentiment instances detected.
       ['type' => 'Positive', 'count' => 1],
       ['type' => 'Negative', 'count' => 1],
       ['type' => 'Neutral', 'count' => 1],
     ],
     'sentences' => [
       ['sentiment' => 'Positive', 'text' => 'Some good text'],
       ['sentiment' => 'Negative', 'text' => 'Some bad text'],
       ['sentiment' => 'Neutral', 'text' => 'Do some analysis'],
     ],
   ]

Then register the processor in Config/nlp_api.php:

  'processor' => 'my_custom_processor',
  'processors' => [
    'my_custom_processor' => [
      'handler' => MyCustomProcessor::class,
    ]
  ]
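
As a rough illustration of the handler contract described above, the sketch below shows one way a processor could produce the documented array shape. The real Modules\NlpApi\Processor\BaseNlpProcessor is not reproduced in this doc, so a minimal stand-in (SketchBaseProcessor, with setText() and an abstract process()) is assumed. KeywordSentimentProcessor and its keyword matching are hypothetical and far too naive for real use. The sketch assumes PHP 8 and returns sentences in input order.

```php
<?php

declare(strict_types=1);

// Stand-in for Modules\NlpApi\Processor\BaseNlpProcessor (assumed shape, not the real class).
abstract class SketchBaseProcessor
{
    protected string $text = '';

    public function setText(string $text): static
    {
        $this->text = $text;

        return $this;
    }

    abstract public function process(): array;
}

// Hypothetical processor: naive keyword matching, for illustration only.
final class KeywordSentimentProcessor extends SketchBaseProcessor
{
    public function process(): array
    {
        // Split after sentence-ending punctuation that is followed by whitespace.
        $parts = preg_split('/(?<=[.!?])\s+/', trim($this->text), -1, PREG_SPLIT_NO_EMPTY);

        $sentences = [];
        $counts = ['Positive' => 0, 'Negative' => 0, 'Neutral' => 0];

        foreach ($parts as $part) {
            $sentiment = $this->classify($part);
            $counts[$sentiment]++;
            $sentences[] = ['sentiment' => $sentiment, 'text' => rtrim($part, '.!?')];
        }

        // Most common sentiment; ties resolve to the first key in $counts.
        $overall = array_search(max($counts), $counts, true);

        $split = [];
        foreach ($counts as $type => $count) {
            $split[] = ['type' => $type, 'count' => $count];
        }

        return [
            'sentiment' => $overall,
            'sentiment_split' => $split,
            'sentences' => $sentences,
        ];
    }

    private function classify(string $sentence): string
    {
        $lower = strtolower($sentence);
        if (str_contains($lower, 'good')) {
            return 'Positive';
        }
        if (str_contains($lower, 'bad')) {
            return 'Negative';
        }

        return 'Neutral';
    }
}

$output = (new KeywordSentimentProcessor())
    ->setText('Some good text. Do some analysis. Some bad text')
    ->process();
```

A real handler would replace classify() with a call to an NLP backend; only the returned array shape is taken from this doc.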

NlpProcessorService

All processors should be accessed via NlpProcessorService::processText(), which performs the appropriate checks first (whether processing is enabled, which processor is active, its configuration, and so on).
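
The kind of checks involved might look something like the sketch below. It is not the actual NlpProcessorService code: resolveNlpHandler() is a hypothetical helper and the 'enabled' key is an assumption, while the 'processor', 'processors' and 'handler' keys follow Config/nlp_api.php. ArrayObject is used only as a stand-in for a handler class that exists.

```php
<?php

declare(strict_types=1);

/**
 * Resolve the handler class for the active processor, or null when
 * processing should not run. The 'enabled' key is an assumption.
 */
function resolveNlpHandler(array $config): ?string
{
    if (!($config['enabled'] ?? false)) {
        return null; // Processing disabled.
    }

    $active = $config['processor'] ?? null;
    if (!is_string($active) || !isset($config['processors'][$active]['handler'])) {
        return null; // No default processor, or it is not registered.
    }

    $handler = $config['processors'][$active]['handler'];

    // Only hand back classes that actually exist.
    return is_string($handler) && class_exists($handler) ? $handler : null;
}

$config = [
    'enabled' => true,
    'processor' => 'my_custom_processor',
    'processors' => [
        // Stand-in class so the sketch runs on its own.
        'my_custom_processor' => ['handler' => ArrayObject::class],
    ],
];

$handler = resolveNlpHandler($config);
```

Returning null instead of throwing keeps callers simple when processing is switched off, which is one possible design; the real service may behave differently.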

FormApi integration

The FormApi implements a few events that allow alteration of submitted data. This module leverages these events to insert and render NLP data.

  • InputElementValueToIndexCacheEvent - Background task that builds all cached values. InputElementValueToIndexCacheListener inserts NLP data into the cache.
  • InputElementValueToRenderEvent - Called on render of an element value. Via InputElementValueToRenderListener any stored NLP data gets appropriate markup inserted (wrappers around positive/negative text).

Preventing NLP processing when seeding dummy content (surveys)

We don't want to waste compute on processing dummy content!

  • All CI environments should have NLP_PROCESSING_ENABLED=false which will disable processing.
  • When processing is enabled but dummy content may still be seeded (e.g. staging), add __no_nlp__ to the field value. InputElementValueToIndexCacheEvent checks for this marker and skips processing. FormInputFaker adds it to the dummy value automatically. @todo - see if there is a better way of doing this that will also work with background jobs.
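
The two rules above can be sketched as a single guard. This is an illustrative assumption, not the module's code: shouldProcessNlp() and the NO_NLP_MARKER constant are hypothetical names, and only the __no_nlp__ marker and the NLP_PROCESSING_ENABLED setting come from this doc.

```php
<?php

declare(strict_types=1);

// Marker taken from the doc above; everything else in this sketch is hypothetical.
const NO_NLP_MARKER = '__no_nlp__';

/**
 * Decide whether a field value should be sent for NLP processing.
 *
 * @param string $value   Raw input element value.
 * @param bool   $enabled Value of the NLP_PROCESSING_ENABLED setting.
 */
function shouldProcessNlp(string $value, bool $enabled): bool
{
    if (!$enabled) {
        return false; // Processing switched off globally (e.g. CI).
    }

    // Seeded dummy content carries the marker and is skipped.
    return !str_contains($value, NO_NLP_MARKER);
}
```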

Providers

CoreNLP

Demo | Docs

@todo update if this changes - At the time of writing we are using a self-hosted instance of CoreNLP on docker-01. It uses the docker image nlpbox/corenlp and is protected with basic auth. Details are in LP. The URL is https://nlp.stack.host

CoreNLP supports more than sentiment analysis. If any additional functionality gets added, this doc should be updated too.
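
For orientation, a CoreNLP server takes the text as the POST body and the annotators as a JSON "properties" query parameter. The sketch below only builds that URL and reads sentiment labels from a decoded response; it does not send a request. coreNlpUrl() and sentenceSentiments() are hypothetical helpers, the fixture is hand-written rather than real server output, and basic auth credentials would have to be supplied by whatever HTTP client is used.

```php
<?php

declare(strict_types=1);

/**
 * Build the request URL for a CoreNLP server. Annotators and output
 * format go into a JSON "properties" query parameter.
 */
function coreNlpUrl(string $baseUrl, array $annotators): string
{
    $properties = json_encode([
        'annotators' => implode(',', $annotators),
        'outputFormat' => 'json',
    ], JSON_THROW_ON_ERROR);

    return rtrim($baseUrl, '/') . '/?properties=' . rawurlencode($properties);
}

/**
 * Pull the per-sentence sentiment labels out of a decoded JSON response.
 * Sentences without a 'sentiment' field are skipped.
 */
function sentenceSentiments(array $response): array
{
    return array_column($response['sentences'] ?? [], 'sentiment');
}

$url = coreNlpUrl('https://nlp.stack.host', ['tokenize', 'ssplit', 'pos', 'parse', 'sentiment']);

// Hand-written, heavily trimmed fixture standing in for a real response.
$fixture = [
    'sentences' => [
        ['index' => 0, 'sentiment' => 'Negative'],
        ['index' => 1, 'sentiment' => 'Neutral'],
    ],
];

$labels = sentenceSentiments($fixture);
```

Other annotators (for example ner) could be requested the same way by changing the list passed to coreNlpUrl().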

Example/Test text

The following provides a good mix of positive/neutral/negative and is useful for testing or demonstration.

Changing the round about to a light is a very bad idea. 
The round about is fine as it is, and allows for smooth traffic flow. 
Adding lights at aitken college is also not a good idea, this school should not be given an preference. 
No other schools have their own lights, and they should be made to wait. 
People on the road should have the preferance. 
The keep clear zone is enough. Do not add more lights. Lights are frustrating and slow down traffic. 
The round about is fine, it works. Don’t reinvent the wheel. Also, merging pains are such a bad idea.
You make people angry and I hear people tooting eachother and road rage every single morning, because people do not want merging lanes. 
They cause unnecessary slowness, accidents and road rage. 
We need two free lanes in each direction with absolutely no merge. 
The lights at the intersection near aitken college are full of issues when people have to merge. 
Or people get in front of you and cut you off at the last second. Leave the round about, no lights at aitken college.

The following is the demo text for Google NLP, which is good for named entity recognition.

Google, headquartered in Mountain View (1600 Amphitheatre Pkwy, Mountain View, CA 940430), 
unveiled the new Android phone for $799 at the Consumer Electronic Show. 
Sundar Pichai said in his keynote that users love their new Android phones.
