PatentsView PatentSearch API

Dev Docs: Overview and Introduction

Concepts

Developing for and maintaining search platform tools requires understanding the topics below:


Software and Tools

Elasticsearch

Python Django Framework

Web Developer Tools

Redis

MySQL

AWS

Docker

Node/Docusaurus

Node can also be installed via Homebrew on macOS. Docusaurus can then be installed by navigating to the docs folder inside your PatentSearch-API clone and running:

npm install

The required Node packages (including Docusaurus) will be detected from package.json.


Setup / Introductory Tutorials

Elasticsearch

Python/Django Framework

Git/GitHub

Docker

Elastic Beanstalk


Topic Tutorials

Elasticsearch


API Implementation Design

Django REST Framework (DRF) offers multiple levels of abstraction for serializer and view design. For serializers, these are (listed from more explicit to more abstract):

  1. Explicitly designed and declared Serializer
  2. ModelSerializer

Similarly, for views there are several levels of abstraction (listed from more explicit to more abstract):

  1. Standalone (decorated) Python functions serving various requests (GET, POST, etc.)
  2. Class-based views where the class maps to a URL and methods in the class serve different requests
  3. Class-based views with DRF Mixins
  4. Generic class-based views

The PatentSearch API uses explicitly declared serializers and class-based views with custom mixins.


Variable Naming Agreement

There are multiple groups of elements in an API’s configuration that need to be aligned in non-obvious ways. (All paths referenced originate at the root of the PatentSearch-API repo.)

Index Name

By convention this should match the response name where possible, but this is not a strict requirement.

  • The name (or an alias) for the Elasticsearch index that stores the data for the endpoint
  • The string value assigned to self.index in the __init__ method of the {entity}Endpoint class in:
    • API/endpoints/{entity}_endpoint_configuration.py
  • The final named argument of the __init__ method of the {entity}ResponseDocument class in:
    • API/endpoints/{entity}_endpoint_configuration.py

Response Name

By convention this is the plural form of the endpoint path, but this is not a strict requirement.

  • The name of the object attribute assigned in the __init__ method of the {entity}ResponseDocument class in:
    • API/endpoints/{entity}_endpoint_configuration.py
  • The first key under properties under {entity}SuccessResponse in:
    • API/static/openapi.json
  • The variable name assigned to the {entity}Serializer object in the APISerializer class in:
    • API/serializers/APISerializer.py

Endpoint Path

By convention this is the word for a single entity within the endpoint/index (e.g., inventor).

  • The string used as the first argument of the path or re_path function call within the urlpatterns list in:
    • API/urls.py
  • The last portion of the path used as a key for that endpoint under paths in:
    • API/static/openapi.json
    • This portion is appended to the standard path prefix: /api/v1/, /api/v1/patent/, or /api/v1/publication/ (corresponds to the Swagger page)
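
As a rough illustration of the convention above, the following sketch looks up which keys under paths end in a given endpoint path. The spec excerpt and the helper name endpoint_paths_in_spec are hypothetical and not taken from the repository. Whether the real keys carry a trailing slash or path parameters is an assumption, so the helper tolerates both.

```python
def endpoint_paths_in_spec(openapi_spec, endpoint_path):
    """Return the OpenAPI path keys whose last literal segment equals endpoint_path."""
    matches = []
    for key in openapi_spec.get("paths", {}):
        # Drop surrounding slashes and skip {param} segments before taking the last portion.
        segments = [s for s in key.strip("/").split("/") if not s.startswith("{")]
        if segments and segments[-1] == endpoint_path:
            matches.append(key)
    return matches


# Hypothetical excerpt; the real API/static/openapi.json may differ.
spec = {
    "paths": {
        "/api/v1/patent/": {},
        "/api/v1/inventor/": {},
        "/api/v1/inventor/{inventor_id}/": {},
    }
}

print(endpoint_paths_in_spec(spec, "inventor"))
```

An empty result for an endpoint path that appears in API/urls.py would suggest the two files have drifted apart.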

Field Names

  • The variable names in the class {entity}Serializer in:
    • API/endpoints/{entity}_endpoint_configuration.py
  • The keys within:
    • components → schemas → {entity}SuccessResponse → properties → {ResponseName} → items → properties in API/static/openapi.json
  • The GET defaults located at:
    • paths → /api/v1/{EndpointPath} → get → parameters → schema → default in API/static/openapi.json (both for the f and s parameters)
  • The POST defaults located at:
    • components → schemas → {entity}PostRequestBody → properties → f / s → default in API/static/openapi.json (both for the f and s parameters)
  • The field names in the Elasticsearch index that stores the data for the endpoint
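
Because field names must agree in several places, a small consistency check can help catch drift. The sketch below is one possible approach, not a tool from the repository. The function name field_name_mismatches, the sample entity Inventor, the response name inventors, and all field names are made up for illustration. It assumes the OpenAPI file and the Elasticsearch mapping have already been loaded into dictionaries.

```python
def field_name_mismatches(serializer_fields, openapi_spec, entity, response_name, es_mapping):
    """Report, per source, the field names it lacks relative to the union of all sources."""
    schema = openapi_spec["components"]["schemas"][f"{entity}SuccessResponse"]
    openapi_fields = set(schema["properties"][response_name]["items"]["properties"])
    es_fields = set(es_mapping["mappings"]["properties"])

    sources = {
        "serializer": set(serializer_fields),
        "openapi": openapi_fields,
        "elasticsearch": es_fields,
    }
    all_names = set().union(*sources.values())

    report = {}
    for source, fields in sources.items():
        missing = sorted(all_names - fields)
        if missing:
            report[source] = missing
    return report


# Hypothetical sample data for illustration only.
sample_spec = {
    "components": {
        "schemas": {
            "InventorSuccessResponse": {
                "properties": {
                    "inventors": {
                        "items": {"properties": {"inventor_id": {}, "inventor_name_last": {}}}
                    }
                }
            }
        }
    }
}
sample_mapping = {"mappings": {"properties": {"inventor_id": {"type": "keyword"}}}}

print(field_name_mismatches(
    ["inventor_id", "inventor_name_last"], sample_spec, "Inventor", "inventors", sample_mapping
))
```

An empty dictionary means the three sources agree. The GET and POST defaults for f and s could be checked the same way by verifying that each default name appears in the serializer field set.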

Endpoint Creation Tutorial

This tutorial walks an onboarding developer step-by-step through creating an endpoint in the PatentSearch-API.

Setup

  • Ensure local dependencies are available (MySQL, Elasticsearch, Redis as needed, and project dependencies installed).
  • Identify the source table(s) and the target Elasticsearch index for the new endpoint.

Creating Data Source

  1. Connect to your local MySQL instance.
  2. Create (or identify) a database and table(s) that will act as the source for the endpoint.
  3. Verify the table(s) contain the expected columns and sample data.

Creating Elasticsearch Schema

  1. Create an Elasticsearch index schema file (JSON) with fields for all columns required by the endpoint. (See Elasticsearch mapping tutorials.)
  2. Create a data loading project/folder in PyCharm/VS Code.
  3. Install the es-data-load package.
  4. Using es-data-load, create the target index in your local Elasticsearch instance.
  5. Connect using Elasticvue and verify the index exists and has the expected mapping.
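
For step 1, a minimal schema might look like the sketch below, which builds the JSON in Python and prints it. It assumes the schema file follows the standard Elasticsearch create-index body (settings plus mappings); the exact file layout expected by es-data-load is not documented here and may differ. All field names are hypothetical.

```python
import json

# Hypothetical index schema using the standard Elasticsearch create-index body.
index_schema = {
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        "properties": {
            # Exact-match identifier.
            "id": {"type": "keyword"},
            # Full-text field with a keyword sub-field for sorting and exact matching.
            "name": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
            "first_seen_date": {"type": "date"},
            "patent_count": {"type": "integer"},
        }
    },
}

schema_json = json.dumps(index_schema, indent=2)
print(schema_json)
```

A text field with a keyword sub-field, like name above, is the kind of field that later needs the ".keyword" suffix for sorting and exact-match operations.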

Loading Data

  1. Create a MySQL → Elasticsearch mapping file.
  2. Run the data load using the es-data-load package.
  3. Verify documents exist in the index (via Elasticvue or Elasticsearch queries).
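
For step 3, one way to verify the load from Python is the Elasticsearch _count API, which returns a JSON body containing a count key. The sketch below uses only the standard library. The helper names and the example host are assumptions; adjust the base URL and add authentication to match your local instance.

```python
import json
from urllib.request import urlopen


def count_url(base_url, index):
    """Build the _count URL for an index, tolerating a trailing slash on the base URL."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body (str or bytes)."""
    return json.loads(body)["count"]


def fetch_count(base_url, index):
    """Query Elasticsearch and return the number of documents in the index."""
    with urlopen(count_url(base_url, index)) as response:
        return parse_count(response.read())
```

With a local unsecured instance, fetch_count("http://localhost:9200", "inventor") should return the number of rows loaded from the source table, assuming one document per row.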

Creating API Endpoint

Creating Serializer

Create a serializer class that inherits from serializers.Serializer:

Example:

from rest_framework import serializers

class {entity_name}Serializer(serializers.Serializer):
    # Declare the fields included in the endpoint, e.g.:
    # id = serializers.CharField()
    # name = serializers.CharField()
    pass

Creating Endpoint Configuration

Create two classes: one that defines the response document structure and one that defines the endpoint configuration.

Replace {entity_name} with the endpoint/entity name (e.g., patent, assignee, or inventor) as it applies.

# Defines the wrapper for a list of response documents
class {entity_name}ResponseDocument(APIResponseDocument):
    def __init__(self, error, count, total_hits, {entity_name}):
        super().__init__(error, count, total_hits)
        self.{entity_name} = {entity_name}


# Fields, operators, Elasticsearch, and other configuration for the endpoint
class {entity_name}Endpoint:
    def __init__(self, **kwargs):
        # Elasticsearch Index
        self.index = "{entity_name}"

        # Fields which are configured as "text" type in ES and consequently require
        # ".keyword" suffix for keyword-like operations
        self.keyword_field_translations = []
  
        # Default f and s fields
        self.f = ["id"]
        self.s = [{"id": "asc"}]

        # List of all allowed fields
        self.field_list = list({entity_name}Serializer.__dict__["_declared_fields"].keys())

        # Wrapper that defines the format for API response
        self.response_encoder = {entity_name}ResponseDocument
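
To illustrate how these settings might be consumed, the sketch below turns f, s, and keyword_field_translations into an Elasticsearch request body. This is not the repository's implementation. It assumes f lists the fields to return and s lists sort clauses, and build_search_body is a hypothetical helper.

```python
def build_search_body(f, s, keyword_field_translations):
    """Build a partial Elasticsearch search body from field and sort settings."""

    def translate(field):
        # Text fields need the ".keyword" sub-field for sorting and exact matching.
        return f"{field}.keyword" if field in keyword_field_translations else field

    sort = [{translate(field): order for field, order in clause.items()} for clause in s]
    return {"_source": list(f), "sort": sort}


body = build_search_body(
    f=["id", "name"],
    s=[{"name": "asc"}, {"id": "desc"}],
    keyword_field_translations=["name"],  # hypothetical text-typed field
)
print(body)
# {'_source': ['id', 'name'], 'sort': [{'name.keyword': 'asc'}, {'id': 'desc'}]}
```

Fields missing from keyword_field_translations pass through unchanged, so the list only needs to name fields mapped as text in the index.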

Creating DRF View Classes (List + Detail)

Create DRF view classes (one for list view and one for detail view):

# DRF View for showing multiple entities
class {entity_name}List(PVAPIListView, {entity_name}Endpoint):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        {entity_name}Endpoint.__init__(self)


# DRF View for showing single entity
class {entity_name}Detail(PVAPIDetailView, {entity_name}Endpoint):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        {entity_name}Endpoint.__init__(self)
        self.pk_field = "id"
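
The explicit {entity_name}Endpoint.__init__(self) call matters when the base view's __init__ does not chain to the next class in the method resolution order. The stand-alone sketch below demonstrates this with plain Python classes. BaseView is a stand-in that is assumed to behave like a typical Django view constructor (it stores keyword arguments and does not call super().__init__()); PVAPIListView itself may behave differently.

```python
class BaseView:
    """Stand-in for a view base class: stores kwargs and does not chain to super()."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)


class InventorEndpoint:
    """Hypothetical endpoint configuration."""

    def __init__(self, **kwargs):
        self.index = "inventor"
        self.f = ["id"]


class InventorList(BaseView, InventorEndpoint):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)       # runs BaseView.__init__ only
        InventorEndpoint.__init__(self)  # sets the endpoint configuration explicitly


class BrokenList(BaseView, InventorEndpoint):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)       # InventorEndpoint.__init__ never runs


view = InventorList(request_id="abc")
print(view.index, hasattr(BrokenList(), "index"))
# inventor False
```

Without the explicit call, attributes such as index and field_list would be missing from the view instance.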

Add URL Patterns (urls.py)

Add URL patterns to API/urls.py:

from django.urls import path, re_path

urlpatterns = [
    path("{entity_name}/<str:pk>/", {entity_name}Detail.as_view(), name="{entity_name}-detail"),
    re_path(r"^{entity_name}/?$", {entity_name}List.as_view(), name="{entity_name}-list"),
]
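
The re_path pattern for the list view accepts the endpoint path with or without a trailing slash. The quick check below uses the standard re module with inventor substituted for {entity_name}. The candidate strings are illustrative and assume the URL prefix has already been stripped by the time the pattern is applied.

```python
import re

# Same shape as the list-view pattern, with a hypothetical entity name filled in.
list_pattern = re.compile(r"^inventor/?$")

candidates = ["inventor", "inventor/", "inventor/abc123/", "inventors"]
results = {candidate: bool(list_pattern.search(candidate)) for candidate in candidates}

for candidate, matched in results.items():
    print(f"{candidate!r}: {matched}")
```

Only the first two candidates match. A path such as inventor/abc123/ falls through to the detail pattern, where <str:pk> captures abc123.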
