feat: integrate AWS Comprehend as a chainable PII redactor #461
Open
Pearltechie wants to merge 1 commit into arakoodev:ts
…se + RxJS Observable APIs) Closes arakoodev#290
CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅
Author: Quick demo (40s): tests passing, then both the promise and RxJS demos running against a stubbed Comprehend, showing PII tagged with [TYPE] before the prompt goes to the LLM. edgechains-pr461-demo.1.mp4
Author: I have read the Arakoo CLA Document and I hereby sign the CLA
Author: recheck
Closes #290
/claim #290
Adds an AWS Comprehend wrapper to `@arakoodev/edgechains.js/ai` so prompts can be PII-scrubbed before being sent to an Endpoint (OpenAI, GeminiAI, LlamaAI, RetellAI).

The issue asks for "new classes that can be chained with existing Endpoint classes (as observables)". I exposed two surfaces over the same underlying logic so people can pick whichever matches their code:

- `AWSComprehend` - promise API (`detectPii`, `containsPii`, `redact`, `redactBatch`, `chain`)
- `redact$`, `redactPii`, `redactPiiText`, `redactPiiBatch` - RxJS Observable / operator API that drops into a `pipe()`

Files

- `JS/edgechains/arakoodev/src/ai/src/lib/aws-comprehend/comprehend.ts` - the class
- `lib/aws-comprehend/observables.ts` - rxjs operators
- `lib/aws-comprehend/index.ts` - barrel
- `tests/awsComprehend.test.ts` - 17 vitest tests, AWS SDK mocked
- `index.ts` - re-export the new symbols
- `JS/edgechains/examples/pii-redactor/OpenAIEndpoint`

New deps in `arakoodev/package.json`

Usage
Promise:
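A minimal sketch of what promise-style usage could look like. It does not use the PR's code: `StubRedactor`, `stubDetect`, and `fakeEndpoint` are hypothetical stand-ins, and the `redact` / `chain` signatures shown are assumptions that only borrow the method names listed above. The detector is a regex stub, so nothing here calls AWS.

```typescript
// Hypothetical stand-ins: nothing below is the PR's implementation.
interface PiiEntity {
  Type: string;
  Score: number;
  BeginOffset: number;
  EndOffset: number; // treated as exclusive in this sketch
}

// A detector is anything that returns PII entities for a text, e.g. a Comprehend call.
type Detector = (text: string) => Promise<PiiEntity[]>;

class StubRedactor {
  private readonly detect: Detector;
  private readonly minConfidence: number;

  constructor(detect: Detector, minConfidence: number = 0.5) {
    this.detect = detect;
    this.minConfidence = minConfidence;
  }

  // Replace each detected span with its [TYPE] tag, working right to left
  // so that earlier offsets stay valid after each replacement.
  async redact(text: string): Promise<string> {
    const entities = (await this.detect(text))
      .filter((e) => e.Score >= this.minConfidence)
      .sort((a, b) => b.BeginOffset - a.BeginOffset);
    let out = text;
    for (const e of entities) {
      out = out.slice(0, e.BeginOffset) + `[${e.Type}]` + out.slice(e.EndOffset);
    }
    return out;
  }

  // Redact first, then hand the cleaned prompt to any async consumer.
  async chain<T>(text: string, next: (clean: string) => Promise<T>): Promise<T> {
    return next(await this.redact(text));
  }
}

// Stubbed detector: flags one email address with a regex instead of calling AWS.
const stubDetect: Detector = async (text) => {
  const m = /[\w.+-]+@[\w-]+\.[\w.]+/.exec(text);
  if (!m) return [];
  return [{ Type: "EMAIL", Score: 0.99, BeginOffset: m.index, EndOffset: m.index + m[0].length }];
};

// Stand-in for an Endpoint call; it only echoes the prompt it received.
const fakeEndpoint = async (prompt: string): Promise<string> => `LLM saw: ${prompt}`;

const redactor = new StubRedactor(stubDetect);
const demo = redactor.chain("Contact jane@example.com today", fakeEndpoint);
demo.then((reply) => console.log(reply)); // prints "LLM saw: Contact [EMAIL] today"
```

The point of the shape is that the endpoint callback never sees the raw prompt: redaction always resolves before `next` is invoked.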
Observable:
`redact()` options: `languageCode`, `piiEntityTypes` (filter to specific types), `minConfidence` (default 0.5), `redactionChar`, `strategy` (`"char"` | `"type"` | `"fixed"`).

Credentials are resolved from constructor -> env (`AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_SESSION_TOKEN`) -> default AWS credential chain. Inputs above 100 KB are rejected before the network call (Comprehend's hard limit).

Tests

Tests mock `@aws-sdk/client-comprehend` so they run with no AWS account. They cover detection, redaction (all three strategies + entity-type filter + confidence filter), batching, the promise `chain()` helper, and every observable operator, including the `from -> redactPii -> mergeMap(endpoint)` composition that the issue asks for.

Demo
IAM
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["comprehend:DetectPiiEntities", "comprehend:ContainsPiiEntities"],
      "Resource": "*"
    }
  ]
}
```

Let me know if you'd like a different naming scheme for the operators or a different default for `concurrency` / `minConfidence`.
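To make the `strategy`, `piiEntityTypes`, and `minConfidence` options from the Usage section easier to discuss, here is a minimal sketch of how they could be applied to a list of already-detected entities. It is an illustration, not the PR's code: `applyRedaction` is a hypothetical helper, and the default strategy, the `"*"` redaction character, and the `[REDACTED]` text for `"fixed"` are all assumptions. Only the entity field names follow the shape Comprehend's `DetectPiiEntities` returns.

```typescript
// Hypothetical sketch: helper name and most defaults are assumptions, not the PR's code.
type Strategy = "char" | "type" | "fixed";

// Field names follow the entity shape returned by Comprehend's DetectPiiEntities.
interface PiiEntity {
  Type: string;
  Score: number;
  BeginOffset: number;
  EndOffset: number; // treated as exclusive in this sketch
}

interface RedactOptions {
  strategy?: Strategy;       // assumed default: "char"
  minConfidence?: number;    // the PR states a default of 0.5
  redactionChar?: string;    // assumed default: "*"
  piiEntityTypes?: string[]; // when set, only these types are redacted
}

function applyRedaction(text: string, entities: PiiEntity[], opts: RedactOptions = {}): string {
  const { strategy = "char", minConfidence = 0.5, redactionChar = "*", piiEntityTypes } = opts;

  const selected = entities
    .filter((e) => e.Score >= minConfidence)
    .filter((e) => !piiEntityTypes || piiEntityTypes.includes(e.Type))
    // Replace from the end of the text so earlier offsets stay valid.
    .sort((a, b) => b.BeginOffset - a.BeginOffset);

  let out = text;
  for (const e of selected) {
    const replacement =
      strategy === "type"
        ? `[${e.Type}]`
        : strategy === "fixed"
          ? "[REDACTED]"
          : redactionChar.repeat(e.EndOffset - e.BeginOffset);
    out = out.slice(0, e.BeginOffset) + replacement + out.slice(e.EndOffset);
  }
  return out;
}

const sampleText = "Mail jane@example.com or call 555-0100";
const sampleEntities: PiiEntity[] = [
  { Type: "EMAIL", Score: 0.99, BeginOffset: 5, EndOffset: 21 },
  { Type: "PHONE", Score: 0.4, BeginOffset: 30, EndOffset: 38 },
];

// With the 0.5 default, the low-confidence PHONE entity is left untouched.
console.log(applyRedaction(sampleText, sampleEntities, { strategy: "type" }));
// prints "Mail [EMAIL] or call 555-0100"
```

In this sample, lowering `minConfidence` to 0.3 would also tag the phone number, which is the trade-off a different default would shift.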