initial public commit

pull/50/head
Vladimir Mandic 2020-10-11 19:22:43 -04:00
parent f47efcddff
commit de01483bc5
71 changed files with 53239 additions and 0 deletions

54
.eslintrc.json Normal file

@@ -0,0 +1,54 @@
{
"globals": {},
"env": {
"browser": true,
"commonjs": true,
"es6": true,
"node": true,
"jquery": true,
"es2020": true
},
"parserOptions": { "ecmaVersion": 2020 },
"plugins": [ ],
"extends": [
"eslint:recommended",
"plugin:import/errors",
"plugin:import/warnings",
"plugin:node/recommended",
"plugin:promise/recommended",
"plugin:json/recommended-with-comments",
"airbnb-base"
],
"ignorePatterns": [ "dist", "assets", "media", "models", "node_modules" ],
"rules": {
"max-len": [1, 275, 3],
"camelcase": "off",
"guard-for-in": "off",
"prefer-template":"off",
"import/extensions": "off",
"func-names": "off",
"no-await-in-loop": "off",
"no-bitwise": "off",
"no-case-declarations":"off",
"no-continue": "off",
"no-loop-func": "off",
"no-mixed-operators": "off",
"no-param-reassign":"off",
"no-plusplus": "off",
"dot-notation": "off",
"no-restricted-globals": "off",
"no-restricted-syntax": "off",
"no-underscore-dangle": "off",
"newline-per-chained-call": "off",
"node/no-unsupported-features/es-syntax": "off",
"node/shebang": "off",
"object-curly-newline": "off",
"prefer-destructuring": "off",
"promise/always-return": "off",
"promise/catch-or-return": "off",
"promise/no-nesting": "off",
"import/no-absolute-path": "off",
"no-regex-spaces": "off",
"radix": "off"
}
}

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
node_modules

173
README.md Normal file

@@ -0,0 +1,173 @@
# Human: 3D Face Detection, Body Pose, Hand & Finger Tracking, Iris Tracking and Age & Gender Prediction
URL: <https://github.com/vladmandic/human>
*Suggestions are welcome!*
## Credits
This is an amalgamation of multiple existing models:
- Face Detection: [**MediaPipe BlazeFace**](https://drive.google.com/file/d/1f39lSzU5Oq-j_OXgS67KfN5wNsoeAZ4V/view)
- Facial Spatial Geometry: [**MediaPipe FaceMesh**](https://drive.google.com/file/d/1VFC_wIpw4O7xBOiTgUldl79d9LA-LsnA/view)
- Eye Iris Details: [**MediaPipe Iris**](https://drive.google.com/file/d/1bsWbokp9AklH2ANjCfmjqEzzxO1CNbMu/view)
- Hand Detection & Skeleton: [**MediaPipe HandPose**](https://drive.google.com/file/d/1sv4sSb9BSNVZhLzxXJ0jBv9DqD-4jnAz/view)
- Body Pose Detection: [**PoseNet**](https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5)
- Age & Gender Prediction: [**SSR-Net**](https://github.com/shamangary/SSR-Net)
## Install
```shell
npm install @vladmandic/human
```
All pre-trained models are included in folder `/models` (25MB total)
## Demo
Demo is included in `/demo`
## Requirements
`Human` library is based on [TensorFlow/JS (TFJS)](https://js.tensorflow.org), but does not package it to allow for independent version management - import `tfjs` before importing `Human`
## Usage
`Human` library does not require special initialization.
All configuration is done in a single JSON object and all model weights are dynamically loaded upon first use (and only then; `Human` will not load weights that it does not need according to the configuration).
There is only *ONE* method you need:
```js
import * as tf from '@tensorflow/tfjs';
import human from '@vladmandic/human';
// 'image': can be any image object: HTMLImage, HTMLVideo, HTMLMedia, Canvas, Tensor4D
// 'options': optional parameter used to override any options present in default configuration
const results = await human.detect(image, options?)
```
Additionally, `Human` library exposes two objects:
```js
human.defaults // default configuration object
human.models // dynamically maintained object of any loaded models
```
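Because model weights load lazily, `human.models` starts out empty and fills in as modules are first used; a quick way to inspect it after a detection (a minimal sketch):
```js
const results = await human.detect(image); // first call loads any required models
console.log(Object.keys(human.models)); // lists only the models that were actually loaded
```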
## Configuration
Below is the output of the `human.defaults` object
Any property can be overridden by passing a user object during `human.detect()`
Note that the user object and the default configuration are merged using deep-merge, so you do not need to redefine the entire configuration; see the example after the parameter list below
```js
human.defaults = {
face: {
enabled: true,
detector: {
modelPath: '/models/human/blazeface/model.json',
maxFaces: 10,
skipFrames: 5,
minConfidence: 0.8,
iouThreshold: 0.3,
scoreThreshold: 0.75,
},
mesh: {
enabled: true,
modelPath: '/models/human/facemesh/model.json',
},
iris: {
enabled: true,
modelPath: '/models/human/iris/model.json',
},
age: {
enabled: true,
modelPath: '/models/human/ssrnet-imdb-age/model.json',
skipFrames: 5,
},
gender: {
enabled: true,
modelPath: '/models/human/ssrnet-imdb-gender/model.json',
},
},
body: {
enabled: true,
modelPath: '/models/human/posenet/model.json',
maxDetections: 5,
scoreThreshold: 0.75,
nmsRadius: 20,
},
hand: {
enabled: true,
skipFrames: 5,
minConfidence: 0.8,
iouThreshold: 0.3,
scoreThreshold: 0.75,
detector: {
anchors: '/models/human/handdetect/anchors.json',
modelPath: '/models/human/handdetect/model.json',
},
skeleton: {
modelPath: '/models/human/handskeleton/model.json',
},
},
};
```
Where:
- `enabled`: controls if the specified module is enabled (note: a module is not loaded until it is required)
- `modelPath`: path to specific pre-trained model weights
- `maxFaces`, `maxDetections`: maximum number of faces or people to analyze (limiting the number in busy scenes results in higher performance)
- `skipFrames`: how many frames to skip before re-running bounding box detection (e.g., face position does not move fast within a video, so it's ok to use previously detected face position and just run face geometry analysis)
- `minConfidence`: threshold for discarding a prediction
- `iouThreshold`: threshold for deciding whether boxes overlap too much in non-maximum suppression
- `scoreThreshold`: threshold for deciding when to remove boxes based on score in non-maximum suppression
- `nmsRadius`: radius for deciding whether points are too close in non-maximum suppression
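For example, a partial user object is enough to override just a few properties, with everything else falling back to `human.defaults` (a minimal sketch of a call with overrides):
```js
import human from '@vladmandic/human';

// partial configuration: only the overrides, deep-merged with human.defaults
const options = {
  face: {
    detector: { maxFaces: 1 }, // analyze a single face for higher performance
    iris: { enabled: false }, // disable iris module; its weights are then never loaded
  },
};

const results = await human.detect(image, options);
```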
## Outputs
Result of `human.detect()` is a single object that includes data for all enabled modules and all detected objects:
```js
result = {
face: // <array of detected objects>
[
{
confidence: // <number>
box: // <array [x, y, width, height]>
mesh: // <array of points [x, y, z]> (468 base points & 10 iris points)
annotations: // <list of object { landmark: array of points }> (32 base annotated landmarks & 2 iris annotations)
iris: // <number> (relative distance of iris to camera, multiply by focal length to get actual distance)
age: // <number> (estimated age)
gender: // <string> (male or female)
}
],
body: // <array of detected objects>
[
{
score: // <number>,
keypoints: // <array of landmarks [ score, landmark, position [x, y] ]> (17 annotated landmarks)
}
],
hand: // <array of detected objects>
[
{
confidence: // <number>,
box: // <array [x, y, width, height]>,
landmarks: // <array of points [x, y, z]> (21 points)
annotations: // <array of landmarks [ landmark: <array of points> ]> (5 annotated landmarks)
}
]
}
```
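As noted above, `iris` is a relative distance, so turning it into an actual distance means multiplying by a camera-specific focal length (the constant below is a hypothetical placeholder):
```js
const FOCAL_LENGTH = 600; // hypothetical camera-specific constant
for (const face of result.face) {
  const distance = face.iris * FOCAL_LENGTH; // relative value scaled by focal length
  console.log(`age: ${face.age} gender: ${face.gender} distance: ${distance}`);
}
```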
## Performance
Of course, performance will vary depending on your hardware, but also on the number of enabled modules as well as their parameters.
For example, on a low-end NVIDIA GTX 1050 it can perform face detection at 50+ FPS, but drops to <5 FPS if all modules are enabled.
## Todo
- Improve detection of smaller faces, add BlazeFace back model
- Create demo, host it on GitHub Pages
- Implement draw helper functions
- Sample Images
- Rename human to human

25
demo/index.html Normal file

@@ -0,0 +1,25 @@
<head>
<script src="https://cdn.jsdelivr.net/npm/three@0.106.2/build/three.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/scatter-gl@0.0.1/lib/scatter-gl.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/tensorflow/2.6.0/tf.es2017.min.js"></script>
<style>
.canvas-wrapper { display: inline-block; vertical-align: top; }
#scatter-gl-container { display: inline-block; vertical-align: top; border: solid 1px black; position: relative; }
#scatter-gl-container canvas { transform: translate3d(-50%, -50%, 0); left: 50%; top: 50%; position: absolute; }
</style>
</head>
<body>
<div id="main">
<div class="container">
<div class="canvas-wrapper">
<canvas id="output"></canvas>
<video id="video" playsinline style="visibility: hidden; width: auto; height: auto">
</video>
</div>
<div id="scatter-gl-container"></div>
<div id="faces"></div>
</div>
</div>
</body>
<script src="https://cdnjs.cloudflare.com/ajax/libs/dat-gui/0.7.6/dat.gui.min.js"></script>
<script type="module" src="./index.js"></script>

120
demo/index.js Normal file

@@ -0,0 +1,120 @@
/* global tf, ScatterGL, dat */
import human from '../dist/human.esm.js';
const state = {
backend: 'webgl',
triangulateMesh: true,
renderPointcloud: true,
stop: false,
videoSize: 700,
};
const options = {
// preset values (based on config defaults) so dat.gui controllers can bind to existing properties
maxFaces: 10,
detectionConfidence: 0.8,
iouThreshold: 0.3,
scoreThreshold: 0.75,
};
let ctx;
let videoWidth;
let videoHeight;
let video;
let canvas;
let scatterGLHasInitialized = false;
let scatterGL;
async function renderPrediction() {
const predictions = await human.detect(video);
ctx.drawImage(video, 0, 0, videoWidth, videoHeight, 0, 0, canvas.width, canvas.height);
const div = document.getElementById('faces');
div.innerHTML = '';
for (const prediction of predictions) {
div.appendChild(prediction.canvas);
ctx.beginPath();
ctx.rect(prediction.box[0], prediction.box[1], prediction.box[2], prediction.box[3]);
ctx.font = 'small-caps 1rem "Segoe UI"';
ctx.fillText(`${prediction.gender} ${prediction.age}`, prediction.box[0] + 2, prediction.box[1] + 16, prediction.box[2]);
ctx.stroke();
if (state.triangulateMesh) {
for (let i = 0; i < human.triangulation.length / 3; i++) {
const points = [human.triangulation[i * 3], human.triangulation[i * 3 + 1], human.triangulation[i * 3 + 2]].map((index) => prediction.mesh[index]);
const region = new Path2D();
region.moveTo(points[0][0], points[0][1]);
for (let j = 1; j < points.length; j++) region.lineTo(points[j][0], points[j][1]);
region.closePath();
ctx.stroke(region);
}
} else {
for (let i = 0; i < prediction.mesh.length; i++) {
const x = prediction.mesh[i][0];
const y = prediction.mesh[i][1];
ctx.beginPath();
ctx.arc(x, y, 1 /* radius */, 0, 2 * Math.PI);
ctx.fill();
}
}
if (state.renderPointcloud && scatterGL != null) {
const pointsData = predictions.map((pred) => pred.mesh.map((point) => ([-point[0], -point[1], -point[2]])));
let flattenedPointsData = [];
for (let i = 0; i < pointsData.length; i++) {
flattenedPointsData = flattenedPointsData.concat(pointsData[i]);
}
const dataset = new ScatterGL.Dataset(flattenedPointsData);
if (!scatterGLHasInitialized) scatterGL.render(dataset);
else scatterGL.updateDataset(dataset);
scatterGLHasInitialized = true;
}
}
if (!state.stop) requestAnimationFrame(renderPrediction);
}
function setupDatGui() {
const gui = new dat.GUI();
gui.add(state, 'stop').onChange(() => { renderPrediction(); });
gui.add(state, 'backend', ['webgl', 'cpu']).onChange((backend) => { tf.setBackend(backend); });
gui.add(options, 'maxFaces', 1, 100, 1).onChange(() => { human.load(options); });
gui.add(options, 'detectionConfidence', 0, 1, 0.05).onChange(() => { human.load(options); });
gui.add(options, 'iouThreshold', 0, 1, 0.05).onChange(() => { human.load(options); });
gui.add(options, 'scoreThreshold', 0, 1, 0.05).onChange(() => { human.load(options); });
gui.add(state, 'triangulateMesh');
gui.add(state, 'renderPointcloud').onChange((render) => { document.querySelector('#scatter-gl-container').style.display = render ? 'inline-block' : 'none'; });
}
async function setupCamera() {
video = document.getElementById('video');
const stream = await navigator.mediaDevices.getUserMedia({
audio: false,
video: { facingMode: 'user', width: state.videoSize, height: state.videoSize },
});
video.srcObject = stream;
return new Promise((resolve) => {
video.onloadedmetadata = () => resolve(video);
});
}
async function main() {
await tf.setBackend(state.backend);
setupDatGui();
await setupCamera();
video.play();
videoWidth = video.videoWidth;
videoHeight = video.videoHeight;
video.width = videoWidth;
video.height = videoHeight;
canvas = document.getElementById('output');
canvas.width = videoWidth;
canvas.height = videoHeight;
const canvasContainer = document.querySelector('.canvas-wrapper');
canvasContainer.style = `width: ${videoWidth}px; height: ${videoHeight}px`;
ctx = canvas.getContext('2d');
// ctx.translate(canvas.width, 0);
// ctx.scale(-1, 1);
ctx.fillStyle = '#32EEDB';
ctx.strokeStyle = '#32EEDB';
ctx.lineWidth = 0.5;
human.load(options);
renderPrediction();
if (state.renderPointcloud) {
document.querySelector('#scatter-gl-container').style = `width: ${state.videoSize}px; height: ${state.videoSize}px;`;
scatterGL = new ScatterGL(document.querySelector('#scatter-gl-container'), { rotateOnStart: false, selectEnabled: false });
}
}
main();

2640
dist/human.esm.js vendored Normal file

File diff suppressed because it is too large

7
dist/human.esm.js.map vendored Normal file

File diff suppressed because one or more lines are too long

4006
dist/human.js vendored Normal file

File diff suppressed because one or more lines are too long

7
dist/human.js.map vendored Normal file

File diff suppressed because one or more lines are too long

2 binary files not shown; 2 file diffs suppressed because one or more lines are too long

17666
models/handdetect/anchors.json Normal file

File diff suppressed because it is too large

2 binary files not shown

11712
models/handdetect/model.json Normal file

File diff suppressed because it is too large

2 binary files not shown

11619
models/handskeleton/model.json Normal file

File diff suppressed because it is too large

Binary file not shown.

1
models/iris/model.json Normal file

File diff suppressed because one or more lines are too long

6 binary files not shown; 5 file diffs suppressed because one or more lines are too long

1994
package-lock.json generated Normal file

File diff suppressed because it is too large

51
package.json Normal file

@@ -0,0 +1,51 @@
{
"name": "@vladmandic/human",
"version": "0.1.3",
"description": "human: 3D Face Detection, Iris Tracking and Age & Gender Prediction",
"sideEffects": false,
"main": "src/index.js",
"module": "dist/human.esm.js",
"browser": "dist/human.js",
"author": "Vladimir Mandic <mandic00@live.com>",
"bugs": {
"url": "https://github.com/vladmandic/human/issues"
},
"homepage": "https://github.com/vladmandic/human#readme",
"license": "MIT",
"engines": {
"node": ">=14.0.0"
},
"repository": {
"type": "git",
"url": "git+https://github.com/vladmandic/human.git"
},
"dependencies": {
"@tensorflow/tfjs": "^2.6.0"
},
"devDependencies": {
"esbuild": "^0.7.13",
"eslint": "^7.10.0",
"eslint-config-airbnb-base": "^14.2.0",
"eslint-plugin-import": "^2.22.1",
"eslint-plugin-json": "^2.1.2",
"eslint-plugin-node": "^11.1.0",
"eslint-plugin-promise": "^4.2.1",
"rimraf": "^3.0.2"
},
"scripts": {
"build": "rimraf dist/ && npm run build-esm && npm run build-iife",
"build-esm": "esbuild --bundle --platform=browser --sourcemap --target=esnext --format=esm --external:@tensorflow --outfile=dist/human.esm.js src/index.js",
"build-iife": "esbuild --bundle --platform=browser --sourcemap --target=esnext --format=iife --minify --global-name=human --outfile=dist/human.js src/index.js"
},
"keywords": [
"face detection",
"detection",
"recognition",
"blazeface",
"facemesh",
"ssrnet",
"tensorflow",
"tensorflowjs",
"tfjs"
]
}

20
src/blazeface/box.js Normal file

@@ -0,0 +1,20 @@
const tf = require('@tensorflow/tfjs');
exports.disposeBox = (box) => {
box.startEndTensor.dispose();
box.startPoint.dispose();
box.endPoint.dispose();
};
exports.createBox = (startEndTensor) => ({
startEndTensor,
startPoint: tf.slice(startEndTensor, [0, 0], [-1, 2]),
endPoint: tf.slice(startEndTensor, [0, 2], [-1, 2]),
});
exports.scaleBox = (box, factors) => {
const starts = tf.mul(box.startPoint, factors);
const ends = tf.mul(box.endPoint, factors);
const newCoordinates = tf.concat2d([starts, ends], 1);
return exports.createBox(newCoordinates);
};

188
src/blazeface/face.js Normal file

@@ -0,0 +1,188 @@
const tf = require('@tensorflow/tfjs');
const bounding = require('./box');
const ANCHORS_CONFIG = {
strides: [8, 16],
anchors: [2, 6],
};
const NUM_LANDMARKS = 6;
function generateAnchors(width, height, outputSpec) {
const anchors = [];
for (let i = 0; i < outputSpec.strides.length; i++) {
const stride = outputSpec.strides[i];
const gridRows = Math.floor((height + stride - 1) / stride);
const gridCols = Math.floor((width + stride - 1) / stride);
const anchorsNum = outputSpec.anchors[i];
for (let gridY = 0; gridY < gridRows; gridY++) {
const anchorY = stride * (gridY + 0.5);
for (let gridX = 0; gridX < gridCols; gridX++) {
const anchorX = stride * (gridX + 0.5);
for (let n = 0; n < anchorsNum; n++) {
anchors.push([anchorX, anchorY]);
}
}
}
}
return anchors;
}
function decodeBounds(boxOutputs, anchors, inputSize) {
const boxStarts = tf.slice(boxOutputs, [0, 1], [-1, 2]);
const centers = tf.add(boxStarts, anchors);
const boxSizes = tf.slice(boxOutputs, [0, 3], [-1, 2]);
const boxSizesNormalized = tf.div(boxSizes, inputSize);
const centersNormalized = tf.div(centers, inputSize);
const halfBoxSize = tf.div(boxSizesNormalized, 2);
const starts = tf.sub(centersNormalized, halfBoxSize);
const ends = tf.add(centersNormalized, halfBoxSize);
const startNormalized = tf.mul(starts, inputSize);
const endNormalized = tf.mul(ends, inputSize);
const concatAxis = 1;
return tf.concat2d([startNormalized, endNormalized], concatAxis);
}
function scaleBoxFromPrediction(face, scaleFactor) {
return tf.tidy(() => {
const box = face['box'] ? face['box'] : face;
return bounding.scaleBox(box, scaleFactor).startEndTensor.squeeze();
});
}
class BlazeFaceModel {
constructor(model, config) {
this.blazeFaceModel = model;
this.width = config.detector.inputSize;
this.height = config.detector.inputSize;
this.maxFaces = config.detector.maxFaces;
this.anchorsData = generateAnchors(config.detector.inputSize, config.detector.inputSize, ANCHORS_CONFIG);
this.anchors = tf.tensor2d(this.anchorsData);
this.inputSizeData = [config.detector.inputSize, config.detector.inputSize];
this.inputSize = tf.tensor1d([config.detector.inputSize, config.detector.inputSize]);
this.iouThreshold = config.detector.iouThreshold;
this.scoreThreshold = config.detector.scoreThreshold;
}
async getBoundingBoxes(inputImage, returnTensors, annotateBoxes = true) {
const [detectedOutputs, boxes, scores] = tf.tidy(() => {
const resizedImage = inputImage.resizeBilinear([this.width, this.height]);
const normalizedImage = tf.mul(tf.sub(resizedImage.div(255), 0.5), 2);
// [1, 897, 17] 1 = batch, 897 = number of anchors
const batchedPrediction = this.blazeFaceModel.predict(normalizedImage);
const prediction = batchedPrediction.squeeze();
const decodedBounds = decodeBounds(prediction, this.anchors, this.inputSize);
const logits = tf.slice(prediction, [0, 0], [-1, 1]);
const scoresOut = tf.sigmoid(logits).squeeze();
return [prediction, decodedBounds, scoresOut];
});
const boxIndicesTensor = await tf.image.nonMaxSuppressionAsync(boxes, scores, this.maxFaces, this.iouThreshold, this.scoreThreshold);
const boxIndices = await boxIndicesTensor.array();
boxIndicesTensor.dispose();
let boundingBoxes = boxIndices.map((boxIndex) => tf.slice(boxes, [boxIndex, 0], [1, -1]));
if (!returnTensors) {
boundingBoxes = await Promise.all(boundingBoxes.map(async (boundingBox) => {
const vals = await boundingBox.array();
boundingBox.dispose();
return vals;
}));
}
const originalHeight = inputImage.shape[1];
const originalWidth = inputImage.shape[2];
let scaleFactor;
if (returnTensors) {
scaleFactor = tf.div([originalWidth, originalHeight], this.inputSize);
} else {
scaleFactor = [
originalWidth / this.inputSizeData[0],
originalHeight / this.inputSizeData[1],
];
}
const annotatedBoxes = [];
for (let i = 0; i < boundingBoxes.length; i++) {
const boundingBox = boundingBoxes[i];
const annotatedBox = tf.tidy(() => {
const box = boundingBox instanceof tf.Tensor
? bounding.createBox(boundingBox)
: bounding.createBox(tf.tensor2d(boundingBox));
if (!annotateBoxes) {
return box;
}
const boxIndex = boxIndices[i];
let anchor;
if (returnTensors) {
anchor = this.anchors.slice([boxIndex, 0], [1, 2]);
} else {
anchor = this.anchorsData[boxIndex];
}
const landmarks = tf.slice(detectedOutputs, [boxIndex, NUM_LANDMARKS - 1], [1, -1])
.squeeze()
.reshape([NUM_LANDMARKS, -1]);
const probability = tf.slice(scores, [boxIndex], [1]);
return { box, landmarks, probability, anchor };
});
annotatedBoxes.push(annotatedBox);
}
boxes.dispose();
scores.dispose();
detectedOutputs.dispose();
return {
boxes: annotatedBoxes,
scaleFactor,
};
}
async estimateFaces(input, returnTensors = false, annotateBoxes = true) {
const image = tf.tidy(() => {
if (!(input instanceof tf.Tensor)) {
input = tf.browser.fromPixels(input);
}
return input.toFloat().expandDims(0);
});
const { boxes, scaleFactor } = await this.getBoundingBoxes(image, returnTensors, annotateBoxes);
image.dispose();
if (returnTensors) {
return boxes.map((face) => {
const scaledBox = scaleBoxFromPrediction(face, scaleFactor);
const normalizedFace = {
topLeft: scaledBox.slice([0], [2]),
bottomRight: scaledBox.slice([2], [2]),
};
if (annotateBoxes) {
const { landmarks, probability, anchor } = face;
const normalizedLandmarks = landmarks.add(anchor).mul(scaleFactor);
normalizedFace.landmarks = normalizedLandmarks;
normalizedFace.probability = probability;
}
return normalizedFace;
});
}
return Promise.all(boxes.map(async (face) => {
const scaledBox = scaleBoxFromPrediction(face, scaleFactor);
let normalizedFace;
if (!annotateBoxes) {
const boxData = await scaledBox.array();
normalizedFace = {
topLeft: boxData.slice(0, 2),
bottomRight: boxData.slice(2),
};
} else {
const [landmarkData, boxData, probabilityData] = await Promise.all([face.landmarks, scaledBox, face.probability].map(async (d) => d.array()));
const anchor = face.anchor;
const [scaleFactorX, scaleFactorY] = scaleFactor;
const scaledLandmarks = landmarkData
.map((landmark) => ([
(landmark[0] + anchor[0]) * scaleFactorX,
(landmark[1] + anchor[1]) * scaleFactorY,
]));
normalizedFace = {
topLeft: boxData.slice(0, 2),
bottomRight: boxData.slice(2),
landmarks: scaledLandmarks,
probability: probabilityData,
};
bounding.disposeBox(face.box);
face.landmarks.dispose();
face.probability.dispose();
}
scaledBox.dispose();
return normalizedFace;
}));
}
}
exports.BlazeFaceModel = BlazeFaceModel;

12
src/blazeface/index.js Normal file

@@ -0,0 +1,12 @@
const tf = require('@tensorflow/tfjs');
const face = require('./face');
async function load(config) {
const blazeface = await tf.loadGraphModel(config.detector.modelPath, { fromTFHub: config.detector.modelPath.includes('tfhub.dev') });
const model = new face.BlazeFaceModel(blazeface, config);
return model;
}
exports.load = load;
const face_2 = require('./face');
Object.defineProperty(exports, 'BlazeFaceModel', { enumerable: true, get() { return face_2.BlazeFaceModel; } });

58
src/config.js Normal file

@@ -0,0 +1,58 @@
export default {
face: {
enabled: true, // refers to the detector, but since all other face modules rely on the detector, this effectively acts as a global switch
detector: {
modelPath: '/models/blazeface/model.json',
inputSize: 128, // fixed value
maxFaces: 10, // maximum number of faces detected in the input, should be set to the minimum number for performance
skipFrames: 5, // how many frames to go without running the bounding box detector, only relevant if maxFaces > 1
minConfidence: 0.8, // threshold for discarding a prediction
iouThreshold: 0.3, // threshold for deciding whether boxes overlap too much in non-maximum suppression, must be between [0, 1]
scoreThreshold: 0.75, // threshold for deciding when to remove boxes based on score in non-maximum suppression
},
mesh: {
enabled: true,
modelPath: '/models/facemesh/model.json',
inputSize: 192, // fixed value
},
iris: {
enabled: true,
modelPath: '/models/iris/model.json',
inputSize: 192, // fixed value
},
age: {
enabled: true,
modelPath: '/models/ssrnet-age/imdb/model.json',
inputSize: 64, // fixed value
skipFrames: 5,
},
gender: {
enabled: true,
modelPath: '/models/ssrnet-gender/imdb/model.json',
},
},
body: {
enabled: true,
modelPath: '/models/posenet/model.json',
inputResolution: 257, // fixed value
outputStride: 16, // fixed value
maxDetections: 5,
scoreThreshold: 0.75,
nmsRadius: 20,
},
hand: {
enabled: true,
inputSize: 256, // fixed value
skipFrames: 5,
minConfidence: 0.8,
iouThreshold: 0.3,
scoreThreshold: 0.75,
detector: {
anchors: '/models/handdetect/anchors.json',
modelPath: '/models/handdetect/model.json',
},
skeleton: {
modelPath: '/models/handskeleton/model.json',
},
},
};

51
src/facemesh/box.js Normal file

@@ -0,0 +1,51 @@
const tf = require('@tensorflow/tfjs');
function scaleBoxCoordinates(box, factor) {
const startPoint = [box.startPoint[0] * factor[0], box.startPoint[1] * factor[1]];
const endPoint = [box.endPoint[0] * factor[0], box.endPoint[1] * factor[1]];
return { startPoint, endPoint };
}
exports.scaleBoxCoordinates = scaleBoxCoordinates;
function getBoxSize(box) {
return [
Math.abs(box.endPoint[0] - box.startPoint[0]),
Math.abs(box.endPoint[1] - box.startPoint[1]),
];
}
exports.getBoxSize = getBoxSize;
function getBoxCenter(box) {
return [
box.startPoint[0] + (box.endPoint[0] - box.startPoint[0]) / 2,
box.startPoint[1] + (box.endPoint[1] - box.startPoint[1]) / 2,
];
}
exports.getBoxCenter = getBoxCenter;
function cutBoxFromImageAndResize(box, image, cropSize) {
const h = image.shape[1];
const w = image.shape[2];
const boxes = [[
box.startPoint[1] / h, box.startPoint[0] / w, box.endPoint[1] / h,
box.endPoint[0] / w,
]];
return tf.image.cropAndResize(image, boxes, [0], cropSize);
}
exports.cutBoxFromImageAndResize = cutBoxFromImageAndResize;
function enlargeBox(box, factor = 1.5) {
const center = getBoxCenter(box);
const size = getBoxSize(box);
const newHalfSize = [factor * size[0] / 2, factor * size[1] / 2];
const startPoint = [center[0] - newHalfSize[0], center[1] - newHalfSize[1]];
const endPoint = [center[0] + newHalfSize[0], center[1] + newHalfSize[1]];
return { startPoint, endPoint, landmarks: box.landmarks };
}
exports.enlargeBox = enlargeBox;
function squarifyBox(box) {
const centers = getBoxCenter(box);
const size = getBoxSize(box);
const maxEdge = Math.max(...size);
const halfSize = maxEdge / 2;
const startPoint = [centers[0] - halfSize, centers[1] - halfSize];
const endPoint = [centers[0] + halfSize, centers[1] + halfSize];
return { startPoint, endPoint, landmarks: box.landmarks };
}
exports.squarifyBox = squarifyBox;

77
src/facemesh/index.js Normal file

@@ -0,0 +1,77 @@
const tf = require('@tensorflow/tfjs');
const blazeface = require('../blazeface');
const keypoints = require('./keypoints');
const pipe = require('./pipeline');
const uv_coords = require('./uvcoords');
exports.uv_coords = uv_coords;
async function loadDetectorModel(config) {
return blazeface.load(config);
}
async function loadMeshModel(modelUrl) {
return tf.loadGraphModel(modelUrl, { fromTFHub: modelUrl.includes('tfhub.dev') });
}
async function loadIrisModel(modelUrl) {
return tf.loadGraphModel(modelUrl, { fromTFHub: modelUrl.includes('tfhub.dev') });
}
async function load(config) {
const models = await Promise.all([
loadDetectorModel(config),
loadMeshModel(config.mesh.modelPath),
loadIrisModel(config.iris.modelPath),
]);
// eslint-disable-next-line no-use-before-define
const faceMesh = new MediaPipeFaceMesh(models[0], models[1], models[2], config);
return faceMesh;
}
exports.load = load;
class MediaPipeFaceMesh {
constructor(blazeFace, blazeMeshModel, irisModel, config) {
this.pipeline = new pipe.Pipeline(blazeFace, blazeMeshModel, irisModel, config);
this.config = config;
}
async estimateFaces(input, config) {
if (config) this.config = config;
const image = tf.tidy(() => {
if (!(input instanceof tf.Tensor)) {
input = tf.browser.fromPixels(input);
}
return input.toFloat().expandDims(0);
});
const results = [];
const predictions = await this.pipeline.predict(image, this.config.iris.enabled, this.config.mesh.enabled);
image.dispose();
if (!predictions) return results;
for (const prediction of predictions) {
const confidence = prediction.confidence.arraySync();
if (confidence >= this.config.detector.minConfidence) {
const result = {
confidence: confidence || 0,
box: prediction.box ? [prediction.box.startPoint[0], prediction.box.startPoint[1], prediction.box.endPoint[0] - prediction.box.startPoint[0], prediction.box.endPoint[1] - prediction.box.startPoint[1]] : 0,
mesh: prediction.coords ? prediction.coords.arraySync() : null,
image: prediction.image ? tf.clone(prediction.image) : null,
// mesh: prediction.coords.arraySync(),
};
const annotations = {};
if (result.mesh && result.mesh.length > 0) {
for (const key in keypoints.MESH_ANNOTATIONS) {
if (this.config.iris.enabled || key.includes('Iris') === false) {
annotations[key] = keypoints.MESH_ANNOTATIONS[key].map((index) => result.mesh[index]);
}
}
}
result['annotations'] = annotations;
results.push(result);
}
tf.dispose(prediction.confidence);
tf.dispose(prediction.image);
tf.dispose(prediction.coords);
}
return results;
}
}
exports.MediaPipeFaceMesh = MediaPipeFaceMesh;

38
src/facemesh/keypoints.js Normal file

@@ -0,0 +1,38 @@
exports.MESH_ANNOTATIONS = {
silhouette: [
10, 338, 297, 332, 284, 251, 389, 356, 454, 323, 361, 288,
397, 365, 379, 378, 400, 377, 152, 148, 176, 149, 150, 136,
172, 58, 132, 93, 234, 127, 162, 21, 54, 103, 67, 109,
],
lipsUpperOuter: [61, 185, 40, 39, 37, 0, 267, 269, 270, 409, 291],
lipsLowerOuter: [146, 91, 181, 84, 17, 314, 405, 321, 375, 291],
lipsUpperInner: [78, 191, 80, 81, 82, 13, 312, 311, 310, 415, 308],
lipsLowerInner: [78, 95, 88, 178, 87, 14, 317, 402, 318, 324, 308],
rightEyeUpper0: [246, 161, 160, 159, 158, 157, 173],
rightEyeLower0: [33, 7, 163, 144, 145, 153, 154, 155, 133],
rightEyeUpper1: [247, 30, 29, 27, 28, 56, 190],
rightEyeLower1: [130, 25, 110, 24, 23, 22, 26, 112, 243],
rightEyeUpper2: [113, 225, 224, 223, 222, 221, 189],
rightEyeLower2: [226, 31, 228, 229, 230, 231, 232, 233, 244],
rightEyeLower3: [143, 111, 117, 118, 119, 120, 121, 128, 245],
rightEyebrowUpper: [156, 70, 63, 105, 66, 107, 55, 193],
rightEyebrowLower: [35, 124, 46, 53, 52, 65],
rightEyeIris: [473, 474, 475, 476, 477],
leftEyeUpper0: [466, 388, 387, 386, 385, 384, 398],
leftEyeLower0: [263, 249, 390, 373, 374, 380, 381, 382, 362],
leftEyeUpper1: [467, 260, 259, 257, 258, 286, 414],
leftEyeLower1: [359, 255, 339, 254, 253, 252, 256, 341, 463],
leftEyeUpper2: [342, 445, 444, 443, 442, 441, 413],
leftEyeLower2: [446, 261, 448, 449, 450, 451, 452, 453, 464],
leftEyeLower3: [372, 340, 346, 347, 348, 349, 350, 357, 465],
leftEyebrowUpper: [383, 300, 293, 334, 296, 336, 285, 417],
leftEyebrowLower: [265, 353, 276, 283, 282, 295],
leftEyeIris: [468, 469, 470, 471, 472],
midwayBetweenEyes: [168],
noseTip: [1],
noseBottom: [2],
noseRightCorner: [98],
noseLeftCorner: [327],
rightCheek: [205],
leftCheek: [425],
};

301
src/facemesh/pipeline.js Normal file

@@ -0,0 +1,301 @@
/* eslint-disable class-methods-use-this */
const tf = require('@tensorflow/tfjs');
const bounding = require('./box');
const keypoints = require('./keypoints');
const util = require('./util');
const LANDMARKS_COUNT = 468;
const UPDATE_REGION_OF_INTEREST_IOU_THRESHOLD = 0.25;
const MESH_MOUTH_INDEX = 13;
const MESH_KEYPOINTS_LINE_OF_SYMMETRY_INDICES = [MESH_MOUTH_INDEX, keypoints.MESH_ANNOTATIONS['midwayBetweenEyes'][0]];
const BLAZEFACE_MOUTH_INDEX = 3;
const BLAZEFACE_NOSE_INDEX = 2;
const BLAZEFACE_KEYPOINTS_LINE_OF_SYMMETRY_INDICES = [BLAZEFACE_MOUTH_INDEX, BLAZEFACE_NOSE_INDEX];
const LEFT_EYE_OUTLINE = keypoints.MESH_ANNOTATIONS['leftEyeLower0'];
const LEFT_EYE_BOUNDS = [LEFT_EYE_OUTLINE[0], LEFT_EYE_OUTLINE[LEFT_EYE_OUTLINE.length - 1]];
const RIGHT_EYE_OUTLINE = keypoints.MESH_ANNOTATIONS['rightEyeLower0'];
const RIGHT_EYE_BOUNDS = [RIGHT_EYE_OUTLINE[0], RIGHT_EYE_OUTLINE[RIGHT_EYE_OUTLINE.length - 1]];
const IRIS_UPPER_CENTER_INDEX = 3;
const IRIS_LOWER_CENTER_INDEX = 4;
const IRIS_IRIS_INDEX = 71;
const IRIS_NUM_COORDINATES = 76;
const ENLARGE_EYE_RATIO = 2.3; // Factor by which to enlarge the box around the eye landmarks so the input region matches the expectations of the iris model.
const IRIS_MODEL_INPUT_SIZE = 64;
const MESH_TO_IRIS_INDICES_MAP = [ // A mapping from facemesh model keypoints to iris model keypoints.
{ key: 'EyeUpper0', indices: [9, 10, 11, 12, 13, 14, 15] },
{ key: 'EyeUpper1', indices: [25, 26, 27, 28, 29, 30, 31] },
{ key: 'EyeUpper2', indices: [41, 42, 43, 44, 45, 46, 47] },
{ key: 'EyeLower0', indices: [0, 1, 2, 3, 4, 5, 6, 7, 8] },
{ key: 'EyeLower1', indices: [16, 17, 18, 19, 20, 21, 22, 23, 24] },
{ key: 'EyeLower2', indices: [32, 33, 34, 35, 36, 37, 38, 39, 40] },
{ key: 'EyeLower3', indices: [54, 55, 56, 57, 58, 59, 60, 61, 62] },
{ key: 'EyebrowUpper', indices: [63, 64, 65, 66, 67, 68, 69, 70] },
{ key: 'EyebrowLower', indices: [48, 49, 50, 51, 52, 53] },
];
// Replace the raw coordinates returned by facemesh with refined iris model coordinates. Update the z coordinate to be an average of the original and the new. This produces the best visual effect.
function replaceRawCoordinates(rawCoords, newCoords, prefix, keys) {
for (let i = 0; i < MESH_TO_IRIS_INDICES_MAP.length; i++) {
const { key, indices } = MESH_TO_IRIS_INDICES_MAP[i];
const originalIndices = keypoints.MESH_ANNOTATIONS[`${prefix}${key}`];
const shouldReplaceAllKeys = keys == null;
if (shouldReplaceAllKeys || keys.includes(key)) {
for (let j = 0; j < indices.length; j++) {
const index = indices[j];
rawCoords[originalIndices[j]] = [
newCoords[index][0], newCoords[index][1],
(newCoords[index][2] + rawCoords[originalIndices[j]][2]) / 2,
];
}
}
}
}
// The Pipeline coordinates between the bounding box and skeleton models.
class Pipeline {
constructor(boundingBoxDetector, meshDetector, irisModel, config) {
// An array of facial bounding boxes.
this.regionsOfInterest = [];
this.runsWithoutFaceDetector = 0;
this.boundingBoxDetector = boundingBoxDetector;
this.meshDetector = meshDetector;
this.irisModel = irisModel;
this.meshWidth = config.mesh.inputSize;
this.meshHeight = config.mesh.inputSize;
this.skipFrames = config.detector.skipFrames;
this.maxFaces = config.detector.maxFaces;
}
transformRawCoords(rawCoords, box, angle, rotationMatrix) {
const boxSize = bounding.getBoxSize({ startPoint: box.startPoint, endPoint: box.endPoint });
const scaleFactor = [boxSize[0] / this.meshWidth, boxSize[1] / this.meshHeight];
const coordsScaled = rawCoords.map((coord) => ([
scaleFactor[0] * (coord[0] - this.meshWidth / 2),
scaleFactor[1] * (coord[1] - this.meshHeight / 2), coord[2],
]));
const coordsRotationMatrix = util.buildRotationMatrix(angle, [0, 0]);
const coordsRotated = coordsScaled.map((coord) => ([...util.rotatePoint(coord, coordsRotationMatrix), coord[2]]));
const inverseRotationMatrix = util.invertTransformMatrix(rotationMatrix);
const boxCenter = [...bounding.getBoxCenter({ startPoint: box.startPoint, endPoint: box.endPoint }), 1];
const originalBoxCenter = [
util.dot(boxCenter, inverseRotationMatrix[0]),
util.dot(boxCenter, inverseRotationMatrix[1]),
];
return coordsRotated.map((coord) => ([
coord[0] + originalBoxCenter[0],
coord[1] + originalBoxCenter[1], coord[2],
]));
}
getLeftToRightEyeDepthDifference(rawCoords) {
const leftEyeZ = rawCoords[LEFT_EYE_BOUNDS[0]][2];
const rightEyeZ = rawCoords[RIGHT_EYE_BOUNDS[0]][2];
return leftEyeZ - rightEyeZ;
}
// Returns a box describing a cropped region around the eye fit for passing to the iris model.
getEyeBox(rawCoords, face, eyeInnerCornerIndex, eyeOuterCornerIndex, flip = false) {
const box = bounding.squarifyBox(bounding.enlargeBox(this.calculateLandmarksBoundingBox([rawCoords[eyeInnerCornerIndex], rawCoords[eyeOuterCornerIndex]]), ENLARGE_EYE_RATIO));
const boxSize = bounding.getBoxSize(box);
let crop = tf.image.cropAndResize(face, [[
box.startPoint[1] / this.meshHeight,
box.startPoint[0] / this.meshWidth, box.endPoint[1] / this.meshHeight,
box.endPoint[0] / this.meshWidth,
]], [0], [IRIS_MODEL_INPUT_SIZE, IRIS_MODEL_INPUT_SIZE]);
if (flip) {
crop = tf.image.flipLeftRight(crop);
}
return { box, boxSize, crop };
}
// Given a cropped image of an eye, returns the coordinates of the contours surrounding the eye and the iris.
getEyeCoords(eyeData, eyeBox, eyeBoxSize, flip = false) {
const eyeRawCoords = [];
for (let i = 0; i < IRIS_NUM_COORDINATES; i++) {
const x = eyeData[i * 3];
const y = eyeData[i * 3 + 1];
const z = eyeData[i * 3 + 2];
eyeRawCoords.push([
(flip
? (1 - (x / IRIS_MODEL_INPUT_SIZE))
: (x / IRIS_MODEL_INPUT_SIZE)) * eyeBoxSize[0] + eyeBox.startPoint[0],
(y / IRIS_MODEL_INPUT_SIZE) * eyeBoxSize[1] + eyeBox.startPoint[1], z,
]);
}
return { rawCoords: eyeRawCoords, iris: eyeRawCoords.slice(IRIS_IRIS_INDEX) };
}
// The z-coordinates returned for the iris are unreliable, so we take the z values from the surrounding keypoints.
getAdjustedIrisCoords(rawCoords, irisCoords, direction) {
const upperCenterZ = rawCoords[keypoints.MESH_ANNOTATIONS[`${direction}EyeUpper0`][IRIS_UPPER_CENTER_INDEX]][2];
const lowerCenterZ = rawCoords[keypoints.MESH_ANNOTATIONS[`${direction}EyeLower0`][IRIS_LOWER_CENTER_INDEX]][2];
const averageZ = (upperCenterZ + lowerCenterZ) / 2;
// Iris indices: 0: center | 1: right | 2: above | 3: left | 4: below
return irisCoords.map((coord, i) => {
let z = averageZ;
if (i === 2) {
z = upperCenterZ;
} else if (i === 4) {
z = lowerCenterZ;
}
return [coord[0], coord[1], z];
});
}
async predict(input, predictIrises, predictMesh) {
if (this.shouldUpdateRegionsOfInterest()) {
const returnTensors = false;
const annotateFace = true;
const { boxes, scaleFactor } = await this.boundingBoxDetector.getBoundingBoxes(input, returnTensors, annotateFace);
if (boxes.length === 0) {
this.regionsOfInterest = [];
return null;
}
const scaledBoxes = boxes.map((prediction) => {
const predictionBoxCPU = {
startPoint: prediction.box.startPoint.squeeze().arraySync(),
endPoint: prediction.box.endPoint.squeeze().arraySync(),
};
const scaledBox = bounding.scaleBoxCoordinates(predictionBoxCPU, scaleFactor);
const enlargedBox = bounding.enlargeBox(scaledBox);
return {
...enlargedBox,
landmarks: prediction.landmarks.arraySync(),
};
});
boxes.forEach((box) => {
if (box != null && box.startPoint != null) {
box.startEndTensor.dispose();
box.startPoint.dispose();
box.endPoint.dispose();
}
});
this.updateRegionsOfInterest(scaledBoxes);
this.runsWithoutFaceDetector = 0;
} else {
this.runsWithoutFaceDetector++;
}
return tf.tidy(() => this.regionsOfInterest.map((box, i) => {
let angle = 0;
// The facial bounding box landmarks could come either from blazeface (if we are using a fresh box), or from the mesh model (if we are reusing an old box).
const boxLandmarksFromMeshModel = box.landmarks.length >= LANDMARKS_COUNT;
let [indexOfMouth, indexOfForehead] = MESH_KEYPOINTS_LINE_OF_SYMMETRY_INDICES;
if (boxLandmarksFromMeshModel === false) {
[indexOfMouth, indexOfForehead] = BLAZEFACE_KEYPOINTS_LINE_OF_SYMMETRY_INDICES;
}
angle = util.computeRotation(box.landmarks[indexOfMouth], box.landmarks[indexOfForehead]);
const faceCenter = bounding.getBoxCenter({ startPoint: box.startPoint, endPoint: box.endPoint });
const faceCenterNormalized = [faceCenter[0] / input.shape[2], faceCenter[1] / input.shape[1]];
let rotatedImage = input;
let rotationMatrix = util.IDENTITY_MATRIX;
if (angle !== 0) {
rotatedImage = tf.image.rotateWithOffset(input, angle, 0, faceCenterNormalized);
rotationMatrix = util.buildRotationMatrix(-angle, faceCenter);
}
const boxCPU = { startPoint: box.startPoint, endPoint: box.endPoint };
const face = bounding.cutBoxFromImageAndResize(boxCPU, rotatedImage, [this.meshHeight, this.meshWidth]).div(255);
// The first returned tensor represents facial contours, which are included in the coordinates.
const [, flag, coords] = this.meshDetector.predict(face);
const coordsReshaped = tf.reshape(coords, [-1, 3]);
let rawCoords = coordsReshaped.arraySync();
if (predictIrises) {
const { box: leftEyeBox, boxSize: leftEyeBoxSize, crop: leftEyeCrop } = this.getEyeBox(rawCoords, face, LEFT_EYE_BOUNDS[0], LEFT_EYE_BOUNDS[1], true);
const { box: rightEyeBox, boxSize: rightEyeBoxSize, crop: rightEyeCrop } = this.getEyeBox(rawCoords, face, RIGHT_EYE_BOUNDS[0], RIGHT_EYE_BOUNDS[1]);
const eyePredictions = (this.irisModel.predict(tf.concat([leftEyeCrop, rightEyeCrop])));
const eyePredictionsData = eyePredictions.dataSync();
const leftEyeData = eyePredictionsData.slice(0, IRIS_NUM_COORDINATES * 3);
const { rawCoords: leftEyeRawCoords, iris: leftIrisRawCoords } = this.getEyeCoords(leftEyeData, leftEyeBox, leftEyeBoxSize, true);
const rightEyeData = eyePredictionsData.slice(IRIS_NUM_COORDINATES * 3);
const { rawCoords: rightEyeRawCoords, iris: rightIrisRawCoords } = this.getEyeCoords(rightEyeData, rightEyeBox, rightEyeBoxSize);
const leftToRightEyeDepthDifference = this.getLeftToRightEyeDepthDifference(rawCoords);
if (Math.abs(leftToRightEyeDepthDifference) < 30) { // User is looking straight ahead.
replaceRawCoordinates(rawCoords, leftEyeRawCoords, 'left');
replaceRawCoordinates(rawCoords, rightEyeRawCoords, 'right');
// If the user is looking to the left or to the right, the iris coordinates tend to diverge too much from the mesh coordinates for them to be merged. So we only update a single contour line above and below the eye.
} else if (leftToRightEyeDepthDifference < 1) { // User is looking towards the right.
replaceRawCoordinates(rawCoords, leftEyeRawCoords, 'left', ['EyeUpper0', 'EyeLower0']);
} else { // User is looking towards the left.
replaceRawCoordinates(rawCoords, rightEyeRawCoords, 'right', ['EyeUpper0', 'EyeLower0']);
}
const adjustedLeftIrisCoords = this.getAdjustedIrisCoords(rawCoords, leftIrisRawCoords, 'left');
const adjustedRightIrisCoords = this.getAdjustedIrisCoords(rawCoords, rightIrisRawCoords, 'right');
rawCoords = rawCoords.concat(adjustedLeftIrisCoords).concat(adjustedRightIrisCoords);
}
const transformedCoordsData = this.transformRawCoords(rawCoords, box, angle, rotationMatrix);
tf.dispose(rawCoords);
const landmarksBox = bounding.enlargeBox(this.calculateLandmarksBoundingBox(transformedCoordsData));
if (predictMesh) {
const transformedCoords = tf.tensor2d(transformedCoordsData);
this.regionsOfInterest[i] = { ...landmarksBox, landmarks: transformedCoords.arraySync() };
const prediction = {
// coords: tf.tensor2d(rawCoords, [rawCoords.length, 3]),
coords: transformedCoords,
box: landmarksBox,
confidence: flag.squeeze(),
image: face,
};
return prediction;
}
const prediction = {
coords: null,
// scaledCoords: null,
box: landmarksBox,
confidence: flag.squeeze(),
image: face,
};
return prediction;
}));
}
// Updates regions of interest if the intersection over union between the incoming and previous regions falls below a threshold.
updateRegionsOfInterest(boxes) {
for (let i = 0; i < boxes.length; i++) {
const box = boxes[i];
const previousBox = this.regionsOfInterest[i];
let iou = 0;
if (previousBox && previousBox.startPoint) {
const [boxStartX, boxStartY] = box.startPoint;
const [boxEndX, boxEndY] = box.endPoint;
const [previousBoxStartX, previousBoxStartY] = previousBox.startPoint;
const [previousBoxEndX, previousBoxEndY] = previousBox.endPoint;
const xStartMax = Math.max(boxStartX, previousBoxStartX);
const yStartMax = Math.max(boxStartY, previousBoxStartY);
const xEndMin = Math.min(boxEndX, previousBoxEndX);
const yEndMin = Math.min(boxEndY, previousBoxEndY);
const intersection = (xEndMin - xStartMax) * (yEndMin - yStartMax);
const boxArea = (boxEndX - boxStartX) * (boxEndY - boxStartY);
const previousBoxArea = (previousBoxEndX - previousBoxStartX) * (previousBoxEndY - previousBoxStartY);
iou = intersection / (boxArea + previousBoxArea - intersection);
}
if (iou < UPDATE_REGION_OF_INTEREST_IOU_THRESHOLD) {
this.regionsOfInterest[i] = box;
}
}
this.regionsOfInterest = this.regionsOfInterest.slice(0, boxes.length);
}
clearRegionOfInterest(index) {
if (this.regionsOfInterest[index] != null) {
this.regionsOfInterest = [
...this.regionsOfInterest.slice(0, index),
...this.regionsOfInterest.slice(index + 1),
];
}
}
shouldUpdateRegionsOfInterest() {
const roisCount = this.regionsOfInterest.length;
const noROIs = roisCount === 0;
if (this.maxFaces === 1 || noROIs) {
return noROIs;
}
return roisCount !== this.maxFaces && this.runsWithoutFaceDetector >= this.skipFrames;
}
calculateLandmarksBoundingBox(landmarks) {
const xs = landmarks.map((d) => d[0]);
const ys = landmarks.map((d) => d[1]);
const startPoint = [Math.min(...xs), Math.min(...ys)];
const endPoint = [Math.max(...xs), Math.max(...ys)];
return { startPoint, endPoint };
}
}
exports.Pipeline = Pipeline;

88
src/facemesh/util.js Normal file

@@ -0,0 +1,88 @@
exports.IDENTITY_MATRIX = [[1, 0, 0], [0, 1, 0], [0, 0, 1]];
/**
* Normalizes the provided angle to the range -pi to pi.
* @param angle The angle in radians to be normalized.
*/
function normalizeRadians(angle) {
return angle - 2 * Math.PI * Math.floor((angle + Math.PI) / (2 * Math.PI));
}
exports.normalizeRadians = normalizeRadians;
/**
* Computes the angle of rotation between two anchor points.
* @param point1 First anchor point
* @param point2 Second anchor point
*/
function computeRotation(point1, point2) {
const radians = Math.PI / 2 - Math.atan2(-(point2[1] - point1[1]), point2[0] - point1[0]);
return normalizeRadians(radians);
}
exports.computeRotation = computeRotation;
function radToDegrees(rad) {
return rad * 180 / Math.PI;
}
exports.radToDegrees = radToDegrees;
function buildTranslationMatrix(x, y) {
return [[1, 0, x], [0, 1, y], [0, 0, 1]];
}
function dot(v1, v2) {
let product = 0;
for (let i = 0; i < v1.length; i++) {
product += v1[i] * v2[i];
}
return product;
}
exports.dot = dot;
function getColumnFrom2DArr(arr, columnIndex) {
const column = [];
for (let i = 0; i < arr.length; i++) {
column.push(arr[i][columnIndex]);
}
return column;
}
exports.getColumnFrom2DArr = getColumnFrom2DArr;
function multiplyTransformMatrices(mat1, mat2) {
const product = [];
const size = mat1.length;
for (let row = 0; row < size; row++) {
product.push([]);
for (let col = 0; col < size; col++) {
product[row].push(dot(mat1[row], getColumnFrom2DArr(mat2, col)));
}
}
return product;
}
function buildRotationMatrix(rotation, center) {
const cosA = Math.cos(rotation);
const sinA = Math.sin(rotation);
const rotationMatrix = [[cosA, -sinA, 0], [sinA, cosA, 0], [0, 0, 1]];
const translationMatrix = buildTranslationMatrix(center[0], center[1]);
const translationTimesRotation = multiplyTransformMatrices(translationMatrix, rotationMatrix);
const negativeTranslationMatrix = buildTranslationMatrix(-center[0], -center[1]);
return multiplyTransformMatrices(translationTimesRotation, negativeTranslationMatrix);
}
exports.buildRotationMatrix = buildRotationMatrix;
function invertTransformMatrix(matrix) {
const rotationComponent = [[matrix[0][0], matrix[1][0]], [matrix[0][1], matrix[1][1]]];
const translationComponent = [matrix[0][2], matrix[1][2]];
const invertedTranslation = [
-dot(rotationComponent[0], translationComponent),
-dot(rotationComponent[1], translationComponent),
];
return [
rotationComponent[0].concat(invertedTranslation[0]),
rotationComponent[1].concat(invertedTranslation[1]),
[0, 0, 1],
];
}
exports.invertTransformMatrix = invertTransformMatrix;
function rotatePoint(homogeneousCoordinate, rotationMatrix) {
return [
dot(homogeneousCoordinate, rotationMatrix[0]),
dot(homogeneousCoordinate, rotationMatrix[1]),
];
}
exports.rotatePoint = rotatePoint;
function xyDistanceBetweenPoints(a, b) {
return Math.sqrt(((a[0] - b[0]) ** 2) + ((a[1] - b[1]) ** 2));
}
exports.xyDistanceBetweenPoints = xyDistanceBetweenPoints;

470
src/facemesh/uvcoords.js Normal file

@@ -0,0 +1,470 @@
exports.UV_COORDS = [
[0.499976992607117, 0.652534008026123],
[0.500025987625122, 0.547487020492554],
[0.499974012374878, 0.602371990680695],
[0.482113003730774, 0.471979022026062],
[0.500150978565216, 0.527155995368958],
[0.499909996986389, 0.498252987861633],
[0.499523013830185, 0.40106201171875],
[0.289712011814117, 0.380764007568359],
[0.499954998493195, 0.312398016452789],
[0.499987006187439, 0.269918978214264],
[0.500023007392883, 0.107050001621246],
[0.500023007392883, 0.666234016418457],
[0.5000159740448, 0.679224014282227],
[0.500023007392883, 0.692348003387451],
[0.499976992607117, 0.695277988910675],
[0.499976992607117, 0.70593398809433],
[0.499976992607117, 0.719385027885437],
[0.499976992607117, 0.737019002437592],
[0.499967992305756, 0.781370997428894],
[0.499816000461578, 0.562981009483337],
[0.473773002624512, 0.573909997940063],
[0.104906998574734, 0.254140973091125],
[0.365929991006851, 0.409575998783112],
[0.338757991790771, 0.41302502155304],
[0.311120003461838, 0.409460008144379],
[0.274657994508743, 0.389131009578705],
[0.393361985683441, 0.403706014156342],
[0.345234006643295, 0.344011008739471],
[0.370094001293182, 0.346076011657715],
[0.319321990013123, 0.347265005111694],
[0.297903001308441, 0.353591024875641],
[0.24779200553894, 0.410809993743896],
[0.396889001131058, 0.842755019664764],
[0.280097991228104, 0.375599980354309],
[0.106310002505779, 0.399955987930298],
[0.2099249958992, 0.391353011131287],
[0.355807989835739, 0.534406006336212],
[0.471751004457474, 0.65040397644043],
[0.474155008792877, 0.680191993713379],
[0.439785003662109, 0.657229006290436],
[0.414617002010345, 0.66654098033905],
[0.450374007225037, 0.680860996246338],
[0.428770989179611, 0.682690978050232],
[0.374971002340317, 0.727805018424988],
[0.486716985702515, 0.547628998756409],
[0.485300987958908, 0.527395009994507],
[0.257764995098114, 0.314490020275116],
[0.401223003864288, 0.455172002315521],
[0.429818987846375, 0.548614978790283],
[0.421351999044418, 0.533740997314453],
[0.276895999908447, 0.532056987285614],
[0.483370006084442, 0.499586999416351],
[0.33721199631691, 0.282882988452911],
[0.296391993761063, 0.293242990970612],
[0.169294998049736, 0.193813979625702],
[0.447580009698868, 0.302609980106354],
[0.392390012741089, 0.353887975215912],
[0.354490011930466, 0.696784019470215],
[0.067304998636246, 0.730105042457581],
[0.442739009857178, 0.572826027870178],
[0.457098007202148, 0.584792017936707],
[0.381974011659622, 0.694710969924927],
[0.392388999462128, 0.694203019142151],
[0.277076005935669, 0.271932005882263],
[0.422551989555359, 0.563233017921448],
[0.385919004678726, 0.281364023685455],
[0.383103013038635, 0.255840003490448],
[0.331431001424789, 0.119714021682739],
[0.229923993349075, 0.232002973556519],
[0.364500999450684, 0.189113974571228],
[0.229622006416321, 0.299540996551514],
[0.173287004232407, 0.278747975826263],
[0.472878992557526, 0.666198015213013],
[0.446828007698059, 0.668527007102966],
[0.422762006521225, 0.673889994621277],
[0.445307999849319, 0.580065965652466],
[0.388103008270264, 0.693961024284363],
[0.403039008378983, 0.706539988517761],
[0.403629004955292, 0.693953037261963],
[0.460041999816895, 0.557139039039612],
[0.431158006191254, 0.692366003990173],
[0.452181994915009, 0.692366003990173],
[0.475387006998062, 0.692366003990173],
[0.465828001499176, 0.779190003871918],
[0.472328990697861, 0.736225962638855],
[0.473087012767792, 0.717857003211975],
[0.473122000694275, 0.704625964164734],
[0.473033010959625, 0.695277988910675],
[0.427942007780075, 0.695277988910675],
[0.426479011774063, 0.703539967536926],
[0.423162013292313, 0.711845993995667],
[0.4183090031147, 0.720062971115112],
[0.390094995498657, 0.639572978019714],
[0.013953999616206, 0.560034036636353],
[0.499913990497589, 0.58014702796936],
[0.413199990987778, 0.69539999961853],
[0.409626007080078, 0.701822996139526],
[0.468080013990402, 0.601534962654114],
[0.422728985548019, 0.585985004901886],
[0.463079988956451, 0.593783974647522],
[0.37211999297142, 0.47341400384903],
[0.334562003612518, 0.496073007583618],
[0.411671012639999, 0.546965003013611],
[0.242175996303558, 0.14767599105835],
[0.290776997804642, 0.201445996761322],
[0.327338010072708, 0.256527006626129],
[0.399509996175766, 0.748921036720276],
[0.441727995872498, 0.261676013469696],
[0.429764986038208, 0.187834024429321],
[0.412198007106781, 0.108901023864746],
[0.288955003023148, 0.398952007293701],
[0.218936994671822, 0.435410976409912],
[0.41278201341629, 0.398970007896423],
[0.257135003805161, 0.355440020561218],
[0.427684992551804, 0.437960982322693],
[0.448339998722076, 0.536936044692993],
[0.178560003638268, 0.45755398273468],
[0.247308000922203, 0.457193970680237],
[0.286267012357712, 0.467674970626831],
[0.332827985286713, 0.460712015628815],
[0.368755996227264, 0.447206974029541],
[0.398963987827301, 0.432654976844788],
[0.476410001516342, 0.405806005001068],
[0.189241006970406, 0.523923993110657],
[0.228962004184723, 0.348950982093811],
[0.490725994110107, 0.562400996685028],
[0.404670000076294, 0.485132992267609],
[0.019469000399113, 0.401564002037048],
[0.426243007183075, 0.420431017875671],
[0.396993011236191, 0.548797011375427],
[0.266469985246658, 0.376977026462555],
[0.439121007919312, 0.51895797252655],
[0.032313998788595, 0.644356966018677],
[0.419054001569748, 0.387154996395111],
[0.462783008813858, 0.505746960639954],
[0.238978996872902, 0.779744982719421],
[0.198220998048782, 0.831938028335571],
[0.107550002634525, 0.540755033493042],
[0.183610007166862, 0.740257024765015],
[0.134409993886948, 0.333683013916016],
[0.385764002799988, 0.883153975009918],
[0.490967005491257, 0.579378008842468],
[0.382384985685349, 0.508572995662689],
[0.174399003386497, 0.397670984268188],
[0.318785011768341, 0.39623498916626],
[0.343364000320435, 0.400596976280212],
[0.396100014448166, 0.710216999053955],
[0.187885001301765, 0.588537991046906],
[0.430987000465393, 0.944064974784851],
[0.318993002176285, 0.898285031318665],
[0.266247987747192, 0.869701027870178],
[0.500023007392883, 0.190576016902924],
[0.499976992607117, 0.954452991485596],
[0.366169989109039, 0.398822009563446],
[0.393207013607025, 0.39553701877594],
[0.410373002290726, 0.391080021858215],
[0.194993004202843, 0.342101991176605],
[0.388664990663528, 0.362284004688263],
[0.365961998701096, 0.355970978736877],
[0.343364000320435, 0.355356991291046],
[0.318785011768341, 0.35834002494812],
[0.301414996385574, 0.363156020641327],
[0.058132998645306, 0.319076001644135],
[0.301414996385574, 0.387449026107788],
[0.499987989664078, 0.618434011936188],
[0.415838003158569, 0.624195992946625],
[0.445681989192963, 0.566076993942261],
[0.465844005346298, 0.620640993118286],
[0.49992299079895, 0.351523995399475],
[0.288718998432159, 0.819945991039276],
[0.335278987884521, 0.852819979190826],
[0.440512001514435, 0.902418971061707],
[0.128294005990028, 0.791940987110138],
[0.408771991729736, 0.373893976211548],
[0.455606997013092, 0.451801002025604],
[0.499877005815506, 0.908990025520325],
[0.375436991453171, 0.924192011356354],
[0.11421000212431, 0.615022003650665],
[0.448662012815475, 0.695277988910675],
[0.4480200111866, 0.704632043838501],
[0.447111994028091, 0.715808033943176],
[0.444831997156143, 0.730794012546539],
[0.430011987686157, 0.766808986663818],
[0.406787008047104, 0.685672998428345],
[0.400738000869751, 0.681069016456604],
[0.392399996519089, 0.677703022956848],
[0.367855995893478, 0.663918972015381],
[0.247923001646996, 0.601333022117615],
[0.452769994735718, 0.420849978923798],
[0.43639200925827, 0.359887003898621],
[0.416164010763168, 0.368713974952698],
[0.413385987281799, 0.692366003990173],
[0.228018000721931, 0.683571994304657],
[0.468268007040024, 0.352671027183533],
[0.411361992359161, 0.804327011108398],
[0.499989002943039, 0.469825029373169],
[0.479153990745544, 0.442654013633728],
[0.499974012374878, 0.439637005329132],
[0.432112008333206, 0.493588984012604],
[0.499886006116867, 0.866917014122009],
[0.49991300702095, 0.821729004383087],
[0.456548988819122, 0.819200992584229],
[0.344549000263214, 0.745438992977142],
[0.37890899181366, 0.574010014533997],
[0.374292999505997, 0.780184984207153],
[0.319687992334366, 0.570737957954407],
[0.357154995203018, 0.604269981384277],
[0.295284003019333, 0.621580958366394],
[0.447750002145767, 0.862477004528046],
[0.410986006259918, 0.508723020553589],
[0.31395098567009, 0.775308012962341],
[0.354128003120422, 0.812552988529205],
[0.324548006057739, 0.703992962837219],
[0.189096003770828, 0.646299958229065],
[0.279776990413666, 0.71465802192688],
[0.1338230073452, 0.682700991630554],
[0.336768001317978, 0.644733011722565],
[0.429883986711502, 0.466521978378296],
[0.455527991056442, 0.548622965812683],
[0.437114000320435, 0.558896005153656],
[0.467287987470627, 0.529924988746643],
[0.414712011814117, 0.335219979286194],
[0.37704598903656, 0.322777986526489],
[0.344107985496521, 0.320150971412659],
[0.312875986099243, 0.32233202457428],
[0.283526003360748, 0.333190023899078],
[0.241245999932289, 0.382785975933075],
[0.102986000478268, 0.468762993812561],
[0.267612010240555, 0.424560010433197],
[0.297879010438919, 0.433175981044769],
[0.333433985710144, 0.433878004550934],
[0.366427004337311, 0.426115989685059],
[0.396012008190155, 0.416696012020111],
[0.420121014118195, 0.41022801399231],
[0.007561000064015, 0.480777025222778],
[0.432949006557465, 0.569517970085144],
[0.458638995885849, 0.479089021682739],
[0.473466008901596, 0.545744001865387],
[0.476087987422943, 0.563830018043518],
[0.468472003936768, 0.555056989192963],
[0.433990985155106, 0.582361996173859],
[0.483518004417419, 0.562983989715576],
[0.482482999563217, 0.57784903049469],
[0.42645001411438, 0.389798998832703],
[0.438998997211456, 0.39649498462677],
[0.450067013502121, 0.400434017181396],
[0.289712011814117, 0.368252992630005],
[0.276670008897781, 0.363372981548309],
[0.517862021923065, 0.471948027610779],
[0.710287988185883, 0.380764007568359],
[0.526226997375488, 0.573909997940063],
[0.895093023777008, 0.254140973091125],
[0.634069979190826, 0.409575998783112],
[0.661242008209229, 0.41302502155304],
[0.688880026340485, 0.409460008144379],
[0.725341975688934, 0.389131009578705],
[0.606630027294159, 0.40370500087738],
[0.654766023159027, 0.344011008739471],
[0.629905998706818, 0.346076011657715],
[0.680678009986877, 0.347265005111694],
[0.702096998691559, 0.353591024875641],
[0.75221198797226, 0.410804986953735],
[0.602918028831482, 0.842862963676453],
[0.719901978969574, 0.375599980354309],
[0.893692970275879, 0.399959981441498],
[0.790081977844238, 0.391354024410248],
[0.643998026847839, 0.534487962722778],
[0.528249025344849, 0.65040397644043],
[0.525849997997284, 0.680191040039062],
[0.560214996337891, 0.657229006290436],
[0.585384011268616, 0.66654098033905],
[0.549625992774963, 0.680860996246338],
[0.57122802734375, 0.682691991329193],
[0.624852001667023, 0.72809898853302],
[0.513050019741058, 0.547281980514526],
[0.51509702205658, 0.527251958847046],
[0.742246985435486, 0.314507007598877],
[0.598631024360657, 0.454979002475739],
[0.570338010787964, 0.548575043678284],
[0.578631997108459, 0.533622980117798],
[0.723087012767792, 0.532054007053375],
[0.516445994377136, 0.499638974666595],
[0.662801027297974, 0.282917976379395],
[0.70362401008606, 0.293271005153656],
[0.830704987049103, 0.193813979625702],
[0.552385985851288, 0.302568018436432],
[0.607609987258911, 0.353887975215912],
[0.645429015159607, 0.696707010269165],
[0.932694971561432, 0.730105042457581],
[0.557260990142822, 0.572826027870178],
[0.542901992797852, 0.584792017936707],
[0.6180260181427, 0.694710969924927],
[0.607590973377228, 0.694203019142151],
[0.722943007946014, 0.271963000297546],
[0.577413976192474, 0.563166975975037],
[0.614082992076874, 0.281386971473694],
[0.616907000541687, 0.255886018276215],
[0.668509006500244, 0.119913995265961],
[0.770092010498047, 0.232020974159241],
[0.635536015033722, 0.189248979091644],
[0.77039098739624, 0.299556016921997],
[0.826722025871277, 0.278755009174347],
[0.527121007442474, 0.666198015213013],
[0.553171992301941, 0.668527007102966],
[0.577238023281097, 0.673889994621277],
[0.554691970348358, 0.580065965652466],
[0.611896991729736, 0.693961024284363],
[0.59696102142334, 0.706539988517761],
[0.596370995044708, 0.693953037261963],
[0.539958000183105, 0.557139039039612],
[0.568841993808746, 0.692366003990173],
[0.547818005084991, 0.692366003990173],
[0.52461302280426, 0.692366003990173],
[0.534089982509613, 0.779141008853912],
[0.527670979499817, 0.736225962638855],
[0.526912987232208, 0.717857003211975],
[0.526877999305725, 0.704625964164734],
[0.526966989040375, 0.695277988910675],
[0.572058022022247, 0.695277988910675],
[0.573521018028259, 0.703539967536926],
[0.57683801651001, 0.711845993995667],
[0.581691026687622, 0.720062971115112],
[0.609944999217987, 0.639909982681274],
[0.986046016216278, 0.560034036636353],
[0.5867999792099, 0.69539999961853],
[0.590372025966644, 0.701822996139526],
[0.531915009021759, 0.601536989212036],
[0.577268004417419, 0.585934996604919],
[0.536915004253387, 0.593786001205444],
[0.627542972564697, 0.473352015018463],
[0.665585994720459, 0.495950996875763],
[0.588353991508484, 0.546862006187439],
[0.757824003696442, 0.14767599105835],
[0.709249973297119, 0.201507985591888],
[0.672684013843536, 0.256581008434296],
[0.600408971309662, 0.74900496006012],
[0.55826598405838, 0.261672019958496],
[0.570303976535797, 0.187870979309082],
[0.588165998458862, 0.109044015407562],
[0.711045026779175, 0.398952007293701],
[0.781069993972778, 0.435405015945435],
[0.587247014045715, 0.398931980133057],
[0.742869973182678, 0.355445981025696],
[0.572156012058258, 0.437651991844177],
[0.55186802148819, 0.536570012569427],
[0.821442008018494, 0.457556009292603],
[0.752701997756958, 0.457181990146637],
[0.71375697851181, 0.467626988887787],
[0.66711300611496, 0.460672974586487],
[0.631101012229919, 0.447153985500336],
[0.6008620262146, 0.432473003864288],
[0.523481011390686, 0.405627012252808],
[0.810747981071472, 0.523926019668579],
[0.771045982837677, 0.348959028720856],
[0.509127020835876, 0.562718033790588],
[0.595292985439301, 0.485023975372314],
[0.980530977249146, 0.401564002037048],
[0.573499977588654, 0.420000016689301],
[0.602994978427887, 0.548687994480133],
[0.733529984951019, 0.376977026462555],
[0.560611009597778, 0.519016981124878],
[0.967685997486115, 0.644356966018677],
[0.580985009670258, 0.387160003185272],
[0.537728011608124, 0.505385041236877],
[0.760966002941132, 0.779752969741821],
[0.801778972148895, 0.831938028335571],
[0.892440974712372, 0.54076099395752],
[0.816350996494293, 0.740260004997253],
[0.865594983100891, 0.333687007427216],
[0.614073991775513, 0.883246004581451],
[0.508952975273132, 0.579437971115112],
[0.617941975593567, 0.508316040039062],
[0.825608015060425, 0.397674977779388],
[0.681214988231659, 0.39623498916626],
[0.656635999679565, 0.400596976280212],
[0.603900015354156, 0.710216999053955],
[0.81208598613739, 0.588539004325867],
[0.56801301240921, 0.944564998149872],
[0.681007981300354, 0.898285031318665],
[0.733752012252808, 0.869701027870178],
[0.633830010890961, 0.398822009563446],
[0.606792986392975, 0.39553701877594],
[0.589659988880157, 0.391062021255493],
[0.805015981197357, 0.342108011245728],
[0.611334979534149, 0.362284004688263],
[0.634037971496582, 0.355970978736877],
[0.656635999679565, 0.355356991291046],
[0.681214988231659, 0.35834002494812],
[0.698584973812103, 0.363156020641327],
[0.941866993904114, 0.319076001644135],
[0.698584973812103, 0.387449026107788],
[0.584177017211914, 0.624107003211975],
[0.554318010807037, 0.566076993942261],
[0.534153997898102, 0.62064003944397],
[0.711217999458313, 0.819975018501282],
[0.664629995822906, 0.852871000766754],
[0.559099972248077, 0.902631998062134],
[0.871706008911133, 0.791940987110138],
[0.591234028339386, 0.373893976211548],
[0.544341027736664, 0.451583981513977],
[0.624562978744507, 0.924192011356354],
[0.88577002286911, 0.615028977394104],
[0.551338016986847, 0.695277988910675],
[0.551980018615723, 0.704632043838501],
[0.552887976169586, 0.715808033943176],
[0.555167973041534, 0.730794012546539],
[0.569944024085999, 0.767035007476807],
[0.593203008174896, 0.685675978660583],
[0.599261999130249, 0.681069016456604],
[0.607599973678589, 0.677703022956848],
[0.631937980651855, 0.663500010967255],
[0.752032995223999, 0.601315021514893],
[0.547226011753082, 0.420395016670227],
[0.563543975353241, 0.359827995300293],
[0.583841025829315, 0.368713974952698],
[0.586614012718201, 0.692366003990173],
[0.771915018558502, 0.683578014373779],
[0.531597018241882, 0.352482974529266],
[0.588370978832245, 0.804440975189209],
[0.52079701423645, 0.442565023899078],
[0.567984998226166, 0.493479013442993],
[0.543282985687256, 0.819254994392395],
[0.655317008495331, 0.745514988899231],
[0.621008992195129, 0.574018001556396],
[0.625559985637665, 0.78031200170517],
[0.680198013782501, 0.570719003677368],
[0.64276397228241, 0.604337990283966],
[0.704662978649139, 0.621529996395111],
[0.552012026309967, 0.862591981887817],
[0.589071989059448, 0.508637011051178],
[0.685944974422455, 0.775357007980347],
[0.645735025405884, 0.812640011310577],
[0.675342977046967, 0.703978002071381],
[0.810858011245728, 0.646304965019226],
[0.72012197971344, 0.714666962623596],
[0.866151988506317, 0.682704985141754],
[0.663187026977539, 0.644596993923187],
[0.570082008838654, 0.466325998306274],
[0.544561982154846, 0.548375964164734],
[0.562758982181549, 0.558784961700439],
[0.531987011432648, 0.530140042304993],
[0.585271000862122, 0.335177004337311],
[0.622952997684479, 0.32277899980545],
[0.655896008014679, 0.320163011550903],
[0.687132000923157, 0.322345972061157],
[0.716481983661652, 0.333200991153717],
[0.758756995201111, 0.382786989212036],
[0.897013008594513, 0.468769013881683],
[0.732392013072968, 0.424547016620636],
[0.70211398601532, 0.433162987232208],
[0.66652500629425, 0.433866024017334],
[0.633504986763, 0.426087975502014],
[0.603875994682312, 0.416586995124817],
[0.579657971858978, 0.409945011138916],
[0.992439985275269, 0.480777025222778],
[0.567192018032074, 0.569419980049133],
[0.54136598110199, 0.478899002075195],
[0.526564002037048, 0.546118021011353],
[0.523913025856018, 0.563830018043518],
[0.531529009342194, 0.555056989192963],
[0.566035985946655, 0.582329034805298],
[0.51631098985672, 0.563053965568542],
[0.5174720287323, 0.577877044677734],
[0.573594987392426, 0.389806985855103],
[0.560697972774506, 0.395331978797913],
[0.549755990505219, 0.399751007556915],
[0.710287988185883, 0.368252992630005],
[0.723330020904541, 0.363372981548309],
];

65
src/handpose/box.js Normal file

@ -0,0 +1,65 @@
const tf = require('@tensorflow/tfjs');
function getBoxSize(box) {
return [
Math.abs(box.endPoint[0] - box.startPoint[0]),
Math.abs(box.endPoint[1] - box.startPoint[1]),
];
}
exports.getBoxSize = getBoxSize;
function getBoxCenter(box) {
return [
box.startPoint[0] + (box.endPoint[0] - box.startPoint[0]) / 2,
box.startPoint[1] + (box.endPoint[1] - box.startPoint[1]) / 2,
];
}
exports.getBoxCenter = getBoxCenter;
function cutBoxFromImageAndResize(box, image, cropSize) {
const h = image.shape[1];
const w = image.shape[2];
const boxes = [[
box.startPoint[1] / h, box.startPoint[0] / w, box.endPoint[1] / h,
box.endPoint[0] / w,
]];
return tf.image.cropAndResize(image, boxes, [0], cropSize);
}
exports.cutBoxFromImageAndResize = cutBoxFromImageAndResize;
function scaleBoxCoordinates(box, factor) {
const startPoint = [box.startPoint[0] * factor[0], box.startPoint[1] * factor[1]];
const endPoint = [box.endPoint[0] * factor[0], box.endPoint[1] * factor[1]];
const palmLandmarks = box.palmLandmarks.map((coord) => {
const scaledCoord = [coord[0] * factor[0], coord[1] * factor[1]];
return scaledCoord;
});
return { startPoint, endPoint, palmLandmarks };
}
exports.scaleBoxCoordinates = scaleBoxCoordinates;
function enlargeBox(box, factor = 1.5) {
const center = getBoxCenter(box);
const size = getBoxSize(box);
const newHalfSize = [factor * size[0] / 2, factor * size[1] / 2];
const startPoint = [center[0] - newHalfSize[0], center[1] - newHalfSize[1]];
const endPoint = [center[0] + newHalfSize[0], center[1] + newHalfSize[1]];
return { startPoint, endPoint, palmLandmarks: box.palmLandmarks };
}
exports.enlargeBox = enlargeBox;
function squarifyBox(box) {
const centers = getBoxCenter(box);
const size = getBoxSize(box);
const maxEdge = Math.max(...size);
const halfSize = maxEdge / 2;
const startPoint = [centers[0] - halfSize, centers[1] - halfSize];
const endPoint = [centers[0] + halfSize, centers[1] + halfSize];
return { startPoint, endPoint, palmLandmarks: box.palmLandmarks };
}
exports.squarifyBox = squarifyBox;
function shiftBox(box, shiftFactor) {
const boxSize = [
box.endPoint[0] - box.startPoint[0], box.endPoint[1] - box.startPoint[1],
];
const shiftVector = [boxSize[0] * shiftFactor[0], boxSize[1] * shiftFactor[1]];
const startPoint = [box.startPoint[0] + shiftVector[0], box.startPoint[1] + shiftVector[1]];
const endPoint = [box.endPoint[0] + shiftVector[0], box.endPoint[1] + shiftVector[1]];
return { startPoint, endPoint, palmLandmarks: box.palmLandmarks };
}
exports.shiftBox = shiftBox;
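
A minimal sketch of how these helpers compose (the box values below are made up): the hand pipeline later applies the same shift, squarify, enlarge chain to turn a palm box into a full-hand crop.

```js
// Illustrative only - sample box values are made up.
const palmBox = { startPoint: [100, 120], endPoint: [200, 260], palmLandmarks: [] };
const handBox = enlargeBox(squarifyBox(shiftBox(palmBox, [0, -0.4])), 3);
console.log(getBoxSize(handBox)); // [420, 420] - square, 3x the larger original edge
```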

107
src/handpose/hand.js Normal file

@ -0,0 +1,107 @@
const tf = require('@tensorflow/tfjs');
const bounding = require('./box');
class HandDetector {
constructor(model, width, height, anchors, iouThreshold, scoreThreshold) {
this.model = model;
this.width = width;
this.height = height;
this.iouThreshold = iouThreshold;
this.scoreThreshold = scoreThreshold;
this.anchors = anchors.map((anchor) => [anchor.x_center, anchor.y_center]);
this.anchorsTensor = tf.tensor2d(this.anchors);
this.inputSizeTensor = tf.tensor1d([width, height]);
this.doubleInputSizeTensor = tf.tensor1d([width * 2, height * 2]);
}
normalizeBoxes(boxes) {
return tf.tidy(() => {
const boxOffsets = tf.slice(boxes, [0, 0], [-1, 2]);
const boxSizes = tf.slice(boxes, [0, 2], [-1, 2]);
const boxCenterPoints = tf.add(tf.div(boxOffsets, this.inputSizeTensor), this.anchorsTensor);
const halfBoxSizes = tf.div(boxSizes, this.doubleInputSizeTensor);
const startPoints = tf.mul(tf.sub(boxCenterPoints, halfBoxSizes), this.inputSizeTensor);
const endPoints = tf.mul(tf.add(boxCenterPoints, halfBoxSizes), this.inputSizeTensor);
return tf.concat2d([startPoints, endPoints], 1);
});
}
normalizeLandmarks(rawPalmLandmarks, index) {
return tf.tidy(() => {
const landmarks = tf.add(tf.div(rawPalmLandmarks.reshape([-1, 7, 2]), this.inputSizeTensor), this.anchors[index]);
return tf.mul(landmarks, this.inputSizeTensor);
});
}
async getBoundingBoxes(input) {
const normalizedInput = tf.tidy(() => tf.mul(tf.sub(input, 0.5), 2));
let batchedPrediction;
if (tf.getBackend() === 'webgl') {
// Currently tfjs-core does not pack depthwiseConv because it fails for
// very large inputs (https://github.com/tensorflow/tfjs/issues/1652).
// TODO(annxingyuan): call tf.enablePackedDepthwiseConv when available
// (https://github.com/tensorflow/tfjs/issues/2821)
const savedWebglPackDepthwiseConvFlag = tf.env().get('WEBGL_PACK_DEPTHWISECONV');
tf.env().set('WEBGL_PACK_DEPTHWISECONV', true);
// The model returns a tensor with the following shape:
// [1 (batch), 2944 (anchor points), 19 (data for each anchor)]
batchedPrediction = this.model.predict(normalizedInput);
tf.env().set('WEBGL_PACK_DEPTHWISECONV', savedWebglPackDepthwiseConvFlag);
} else {
batchedPrediction = this.model.predict(normalizedInput);
}
const prediction = batchedPrediction.squeeze();
// Regression score for each anchor point.
const scores = tf.tidy(() => tf.sigmoid(tf.slice(prediction, [0, 0], [-1, 1])).squeeze());
// Bounding box for each anchor point.
const rawBoxes = tf.slice(prediction, [0, 1], [-1, 4]);
const boxes = this.normalizeBoxes(rawBoxes);
const boxesWithHandsTensor = await tf.image.nonMaxSuppressionAsync(boxes, scores, 1, this.iouThreshold, this.scoreThreshold);
const boxesWithHands = await boxesWithHandsTensor.array();
const toDispose = [
normalizedInput, batchedPrediction, boxesWithHandsTensor, prediction,
boxes, rawBoxes, scores,
];
if (boxesWithHands.length === 0) {
toDispose.forEach((tensor) => tensor.dispose());
return null;
}
const boxIndex = boxesWithHands[0];
const matchingBox = tf.slice(boxes, [boxIndex, 0], [1, -1]);
const rawPalmLandmarks = tf.slice(prediction, [boxIndex, 5], [1, 14]);
const palmLandmarks = tf.tidy(() => this.normalizeLandmarks(rawPalmLandmarks, boxIndex).reshape([
-1, 2,
]));
toDispose.push(rawPalmLandmarks);
toDispose.forEach((tensor) => tensor.dispose());
return { boxes: matchingBox, palmLandmarks };
}
/**
* Returns a Box identifying the bounding box of a hand within the image.
* Returns null if there is no hand in the image.
*
* @param input The image to classify.
*/
async estimateHandBounds(input) {
const inputHeight = input.shape[1];
const inputWidth = input.shape[2];
const image = tf.tidy(() => input.resizeBilinear([this.width, this.height]).div(255));
const prediction = await this.getBoundingBoxes(image);
if (prediction === null) {
image.dispose();
return null;
}
// Calling arraySync on both boxes and palmLandmarks because the tensors are
// very small so it's not worth calling await array().
const boundingBoxes = prediction.boxes.arraySync();
const startPoint = boundingBoxes[0].slice(0, 2);
const endPoint = boundingBoxes[0].slice(2, 4);
const palmLandmarks = prediction.palmLandmarks.arraySync();
image.dispose();
prediction.boxes.dispose();
prediction.palmLandmarks.dispose();
return bounding.scaleBoxCoordinates({ startPoint, endPoint, palmLandmarks }, [inputWidth / this.width, inputHeight / this.height]);
}
}
exports.HandDetector = HandDetector;
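
To make the tensor math in `normalizeBoxes` concrete, here is the same decode for a single anchor in plain numbers (all values are illustrative, not from the model):

```js
// Scalar version of normalizeBoxes for one anchor; values are made up.
const inputSize = 256;                 // detector input width/height
const anchor = [0.5, 0.5];             // normalized anchor center
const raw = [4, -2, 32, 48];           // model output: dx, dy, width, height
const center = [raw[0] / inputSize + anchor[0], raw[1] / inputSize + anchor[1]];
const half = [raw[2] / (2 * inputSize), raw[3] / (2 * inputSize)];
const startPoint = [(center[0] - half[0]) * inputSize, (center[1] - half[1]) * inputSize];
const endPoint = [(center[0] + half[0]) * inputSize, (center[1] + half[1]) * inputSize];
console.log(startPoint, endPoint);     // [116, 102] and [148, 150]
```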

93
src/handpose/index.js Normal file

@ -0,0 +1,93 @@
const tf = require('@tensorflow/tfjs');
const hand = require('./hand');
const keypoints = require('./keypoints');
const pipe = require('./pipeline');
// Load the bounding box detector model.
async function loadHandDetectorModel(url) {
return tf.loadGraphModel(url, { fromTFHub: url.includes('tfhub.dev') });
}
// Load the mesh detector model.
async function loadHandPoseModel(url) {
return tf.loadGraphModel(url, { fromTFHub: url.includes('tfhub.dev') });
}
// In single shot detector pipelines, the output space is discretized into a set
// of bounding boxes, each of which is assigned a score during prediction. The
// anchors define the coordinates of these boxes.
async function loadAnchors(url) {
return tf.util
.fetch(url)
.then((d) => d.json());
}
/**
 * Load handpose.
 *
 * @param config A configuration object with the following properties:
 * - `inputSize` Width and height the detector and skeleton models expect.
 * - `skipFrames` How many frames to go without running the bounding box
 * detector. Set to a lower value if you want a safety net in case the mesh
 * detector produces consistently flawed predictions.
 * - `minConfidence` Threshold for discarding a prediction.
 * - `iouThreshold` A float representing the threshold for deciding whether
 * boxes overlap too much in non-maximum suppression. Must be between [0, 1].
 * - `scoreThreshold` A threshold for deciding when to remove boxes based
 * on score in non-maximum suppression.
 * - `detector.anchors`, `detector.modelPath`, `skeleton.modelPath` URLs for
 * the anchor definitions and the two model graphs.
 */
async function load(config) {
const [ANCHORS, handDetectorModel, handPoseModel] = await Promise.all([
loadAnchors(config.detector.anchors),
loadHandDetectorModel(config.detector.modelPath),
loadHandPoseModel(config.skeleton.modelPath),
]);
const detector = new hand.HandDetector(handDetectorModel, config.inputSize, config.inputSize, ANCHORS, config.iouThreshold, config.scoreThreshold);
const pipeline = new pipe.HandPipeline(detector, handPoseModel, config.inputSize, config.inputSize, config.skipFrames, config.minConfidence);
// eslint-disable-next-line no-use-before-define
const handpose = new HandPose(pipeline);
return handpose;
}
exports.load = load;
class HandPose {
constructor(pipeline) {
this.pipeline = pipeline;
}
static getAnnotations() {
return keypoints.MESH_ANNOTATIONS;
}
/**
* Finds hands in the input image.
*
* @param input The image to classify. Can be a tensor, DOM element image,
* video, or canvas.
* @param config Pipeline configuration; `minConfidence` is used to discard
* low-quality predictions.
*/
async estimateHands(input, config) {
const image = tf.tidy(() => {
if (!(input instanceof tf.Tensor)) {
input = tf.browser.fromPixels(input);
}
return input.toFloat().expandDims(0);
});
const prediction = await this.pipeline.estimateHand(image, config);
image.dispose();
if (!prediction) return [];
const annotations = {};
for (const key of Object.keys(keypoints.MESH_ANNOTATIONS)) {
annotations[key] = keypoints.MESH_ANNOTATIONS[key].map((index) => prediction.landmarks[index]);
}
return [{
confidence: prediction.confidence || 0,
box: prediction.box ? [prediction.box.topLeft[0], prediction.box.topLeft[1], prediction.box.bottomRight[0] - prediction.box.topLeft[0], prediction.box.bottomRight[1] - prediction.box.topLeft[1]] : 0,
landmarks: prediction.landmarks,
annotations,
}];
}
}
exports.HandPose = HandPose;
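
A hedged usage sketch: the config shape is inferred from `load()` above, and the model/anchor paths are placeholders rather than real package paths.

```js
// All paths and values below are assumptions for illustration.
const handConfig = {
  inputSize: 256,
  skipFrames: 10,
  minConfidence: 0.5,
  iouThreshold: 0.3,
  scoreThreshold: 0.7,
  detector: { anchors: '/models/anchors.json', modelPath: '/models/handdetect.json' },
  skeleton: { modelPath: '/models/handskeleton.json' },
};
const model = await load(handConfig);
const hands = await model.estimateHands(document.getElementById('video'), handConfig);
```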

8
src/handpose/keypoints.js Normal file

@ -0,0 +1,8 @@
exports.MESH_ANNOTATIONS = {
thumb: [1, 2, 3, 4],
indexFinger: [5, 6, 7, 8],
middleFinger: [9, 10, 11, 12],
ringFinger: [13, 14, 15, 16],
pinky: [17, 18, 19, 20],
palmBase: [0],
};

193
src/handpose/pipeline.js Normal file

@ -0,0 +1,193 @@
const tf = require('@tensorflow/tfjs');
const bounding = require('./box');
const util = require('./util');
const UPDATE_REGION_OF_INTEREST_IOU_THRESHOLD = 0.8;
const PALM_BOX_SHIFT_VECTOR = [0, -0.4];
const PALM_BOX_ENLARGE_FACTOR = 3;
const HAND_BOX_SHIFT_VECTOR = [0, -0.1];
const HAND_BOX_ENLARGE_FACTOR = 1.65;
const PALM_LANDMARK_IDS = [0, 5, 9, 13, 17, 1, 2];
const PALM_LANDMARKS_INDEX_OF_PALM_BASE = 0;
const PALM_LANDMARKS_INDEX_OF_MIDDLE_FINGER_BASE = 2;
// The Pipeline coordinates between the bounding box and skeleton models.
class HandPipeline {
constructor(boundingBoxDetector, meshDetector, meshWidth, meshHeight, maxContinuousChecks, detectionConfidence) {
// An array of hand bounding boxes.
this.regionsOfInterest = [];
this.runsWithoutHandDetector = 0;
this.boundingBoxDetector = boundingBoxDetector;
this.meshDetector = meshDetector;
this.maxContinuousChecks = maxContinuousChecks;
this.detectionConfidence = detectionConfidence;
this.meshWidth = meshWidth;
this.meshHeight = meshHeight;
this.maxHandsNumber = 1; // TODO(annxingyuan): Add multi-hand support.
}
// Get the bounding box surrounding the hand, given palm landmarks.
getBoxForPalmLandmarks(palmLandmarks, rotationMatrix) {
const rotatedPalmLandmarks = palmLandmarks.map((coord) => {
const homogeneousCoordinate = [...coord, 1];
return util.rotatePoint(homogeneousCoordinate, rotationMatrix);
});
const boxAroundPalm = this.calculateLandmarksBoundingBox(rotatedPalmLandmarks);
// boxAroundPalm only surrounds the palm - therefore we shift it
// upwards so it will capture fingers once enlarged + squarified.
return bounding.enlargeBox(bounding.squarifyBox(bounding.shiftBox(boxAroundPalm, PALM_BOX_SHIFT_VECTOR)), PALM_BOX_ENLARGE_FACTOR);
}
// Get the bounding box surrounding the hand, given all hand landmarks.
getBoxForHandLandmarks(landmarks) {
// The MediaPipe hand mesh model is trained on hands with empty space
// around them, so we still need to shift / enlarge boxAroundHand even
// though it surrounds the entire hand.
const boundingBox = this.calculateLandmarksBoundingBox(landmarks);
const boxAroundHand = bounding.enlargeBox(bounding.squarifyBox(bounding.shiftBox(boundingBox, HAND_BOX_SHIFT_VECTOR)), HAND_BOX_ENLARGE_FACTOR);
const palmLandmarks = [];
for (let i = 0; i < PALM_LANDMARK_IDS.length; i++) {
palmLandmarks.push(landmarks[PALM_LANDMARK_IDS[i]].slice(0, 2));
}
boxAroundHand.palmLandmarks = palmLandmarks;
return boxAroundHand;
}
// Scale, rotate, and translate raw keypoints from the model so they map to
// the input coordinates.
transformRawCoords(rawCoords, box, angle, rotationMatrix) {
const boxSize = bounding.getBoxSize(box);
const scaleFactor = [boxSize[0] / this.meshWidth, boxSize[1] / this.meshHeight];
const coordsScaled = rawCoords.map((coord) => [
scaleFactor[0] * (coord[0] - this.meshWidth / 2),
scaleFactor[1] * (coord[1] - this.meshHeight / 2), coord[2],
]);
const coordsRotationMatrix = util.buildRotationMatrix(angle, [0, 0]);
const coordsRotated = coordsScaled.map((coord) => {
const rotated = util.rotatePoint(coord, coordsRotationMatrix);
return [...rotated, coord[2]];
});
const inverseRotationMatrix = util.invertTransformMatrix(rotationMatrix);
const boxCenter = [...bounding.getBoxCenter(box), 1];
const originalBoxCenter = [
util.dot(boxCenter, inverseRotationMatrix[0]),
util.dot(boxCenter, inverseRotationMatrix[1]),
];
return coordsRotated.map((coord) => [
coord[0] + originalBoxCenter[0], coord[1] + originalBoxCenter[1],
coord[2],
]);
}
async estimateHand(image, config) {
const useFreshBox = this.shouldUpdateRegionsOfInterest();
if (useFreshBox === true) {
const boundingBoxPrediction = await this.boundingBoxDetector.estimateHandBounds(image);
if (boundingBoxPrediction === null) {
image.dispose();
this.regionsOfInterest = [];
return null;
}
this.updateRegionsOfInterest(boundingBoxPrediction, true /* force update */);
this.runsWithoutHandDetector = 0;
} else {
this.runsWithoutHandDetector++;
}
// Rotate input so the hand is vertically oriented.
const currentBox = this.regionsOfInterest[0];
const angle = util.computeRotation(currentBox.palmLandmarks[PALM_LANDMARKS_INDEX_OF_PALM_BASE], currentBox.palmLandmarks[PALM_LANDMARKS_INDEX_OF_MIDDLE_FINGER_BASE]);
const palmCenter = bounding.getBoxCenter(currentBox);
const palmCenterNormalized = [palmCenter[0] / image.shape[2], palmCenter[1] / image.shape[1]];
const rotatedImage = tf.image.rotateWithOffset(image, angle, 0, palmCenterNormalized);
const rotationMatrix = util.buildRotationMatrix(-angle, palmCenter);
// The bounding box detector only detects palms, so if we're using a fresh
// bounding box prediction, we have to construct the hand bounding box from
// the palm keypoints.
const box = useFreshBox ? this.getBoxForPalmLandmarks(currentBox.palmLandmarks, rotationMatrix) : currentBox;
const croppedInput = bounding.cutBoxFromImageAndResize(box, rotatedImage, [this.meshWidth, this.meshHeight]);
const handImage = croppedInput.div(255);
croppedInput.dispose();
rotatedImage.dispose();
let prediction;
if (tf.getBackend() === 'webgl') {
// Currently tfjs-core does not pack depthwiseConv because it fails for
// very large inputs (https://github.com/tensorflow/tfjs/issues/1652).
// TODO(annxingyuan): call tf.enablePackedDepthwiseConv when available
// (https://github.com/tensorflow/tfjs/issues/2821)
const savedWebglPackDepthwiseConvFlag = tf.env().get('WEBGL_PACK_DEPTHWISECONV');
tf.env().set('WEBGL_PACK_DEPTHWISECONV', true);
prediction = this.meshDetector.predict(handImage);
tf.env().set('WEBGL_PACK_DEPTHWISECONV', savedWebglPackDepthwiseConvFlag);
} else {
prediction = this.meshDetector.predict(handImage);
}
const [flag, keypoints] = prediction;
handImage.dispose();
const flagValue = flag.dataSync()[0];
flag.dispose();
if (flagValue < config.minConfidence) {
keypoints.dispose();
this.regionsOfInterest = [];
return null;
}
const keypointsReshaped = tf.reshape(keypoints, [-1, 3]);
// Calling arraySync() because the tensor is very small so it's not worth
// calling await array().
const rawCoords = keypointsReshaped.arraySync();
keypoints.dispose();
keypointsReshaped.dispose();
const coords = this.transformRawCoords(rawCoords, box, angle, rotationMatrix);
const nextBoundingBox = this.getBoxForHandLandmarks(coords);
this.updateRegionsOfInterest(nextBoundingBox, false /* don't force update */);
const result = {
landmarks: coords,
confidence: flagValue,
box: {
topLeft: nextBoundingBox.startPoint,
bottomRight: nextBoundingBox.endPoint,
},
};
return result;
}
// eslint-disable-next-line class-methods-use-this
calculateLandmarksBoundingBox(landmarks) {
const xs = landmarks.map((d) => d[0]);
const ys = landmarks.map((d) => d[1]);
const startPoint = [Math.min(...xs), Math.min(...ys)];
const endPoint = [Math.max(...xs), Math.max(...ys)];
return { startPoint, endPoint };
}
// Updates regions of interest if the intersection over union between
// the incoming and previous regions falls below a threshold.
updateRegionsOfInterest(box, forceUpdate) {
if (forceUpdate) {
this.regionsOfInterest = [box];
} else {
const previousBox = this.regionsOfInterest[0];
let iou = 0;
if (previousBox != null && previousBox.startPoint != null) {
const [boxStartX, boxStartY] = box.startPoint;
const [boxEndX, boxEndY] = box.endPoint;
const [previousBoxStartX, previousBoxStartY] = previousBox.startPoint;
const [previousBoxEndX, previousBoxEndY] = previousBox.endPoint;
const xStartMax = Math.max(boxStartX, previousBoxStartX);
const yStartMax = Math.max(boxStartY, previousBoxStartY);
const xEndMin = Math.min(boxEndX, previousBoxEndX);
const yEndMin = Math.min(boxEndY, previousBoxEndY);
const intersection = (xEndMin - xStartMax) * (yEndMin - yStartMax);
const boxArea = (boxEndX - boxStartX) * (boxEndY - boxStartY);
const previousBoxArea = (previousBoxEndX - previousBoxStartX) * (previousBoxEndY - previousBoxStartY);
iou = intersection / (boxArea + previousBoxArea - intersection);
}
this.regionsOfInterest[0] = iou > UPDATE_REGION_OF_INTEREST_IOU_THRESHOLD ? previousBox : box;
}
}
shouldUpdateRegionsOfInterest() {
const roisCount = this.regionsOfInterest.length;
return roisCount !== this.maxHandsNumber || this.runsWithoutHandDetector >= this.maxContinuousChecks;
}
}
exports.HandPipeline = HandPipeline;
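
A worked example of the IOU test in `updateRegionsOfInterest`, with made-up boxes:

```js
const a = { startPoint: [0, 0], endPoint: [100, 100] };   // previous ROI
const b = { startPoint: [50, 50], endPoint: [150, 150] }; // incoming box
const w = Math.min(a.endPoint[0], b.endPoint[0]) - Math.max(a.startPoint[0], b.startPoint[0]); // 50
const h = Math.min(a.endPoint[1], b.endPoint[1]) - Math.max(a.startPoint[1], b.startPoint[1]); // 50
const intersection = w * h;                               // 2500
const iou = intersection / (100 * 100 + 100 * 100 - intersection); // 2500 / 17500 ≈ 0.14
// 0.14 < UPDATE_REGION_OF_INTEREST_IOU_THRESHOLD (0.8), so b replaces the tracked region.
```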

68
src/handpose/util.js Normal file

@ -0,0 +1,68 @@
function normalizeRadians(angle) {
return angle - 2 * Math.PI * Math.floor((angle + Math.PI) / (2 * Math.PI));
}
exports.normalizeRadians = normalizeRadians;
function computeRotation(point1, point2) {
const radians = Math.PI / 2 - Math.atan2(-(point2[1] - point1[1]), point2[0] - point1[0]);
return normalizeRadians(radians);
}
exports.computeRotation = computeRotation;
const buildTranslationMatrix = (x, y) => ([[1, 0, x], [0, 1, y], [0, 0, 1]]);
function dot(v1, v2) {
let product = 0;
for (let i = 0; i < v1.length; i++) {
product += v1[i] * v2[i];
}
return product;
}
exports.dot = dot;
function getColumnFrom2DArr(arr, columnIndex) {
const column = [];
for (let i = 0; i < arr.length; i++) {
column.push(arr[i][columnIndex]);
}
return column;
}
exports.getColumnFrom2DArr = getColumnFrom2DArr;
function multiplyTransformMatrices(mat1, mat2) {
const product = [];
const size = mat1.length;
for (let row = 0; row < size; row++) {
product.push([]);
for (let col = 0; col < size; col++) {
product[row].push(dot(mat1[row], getColumnFrom2DArr(mat2, col)));
}
}
return product;
}
function buildRotationMatrix(rotation, center) {
const cosA = Math.cos(rotation);
const sinA = Math.sin(rotation);
const rotationMatrix = [[cosA, -sinA, 0], [sinA, cosA, 0], [0, 0, 1]];
const translationMatrix = buildTranslationMatrix(center[0], center[1]);
const translationTimesRotation = multiplyTransformMatrices(translationMatrix, rotationMatrix);
const negativeTranslationMatrix = buildTranslationMatrix(-center[0], -center[1]);
return multiplyTransformMatrices(translationTimesRotation, negativeTranslationMatrix);
}
exports.buildRotationMatrix = buildRotationMatrix;
function invertTransformMatrix(matrix) {
const rotationComponent = [[matrix[0][0], matrix[1][0]], [matrix[0][1], matrix[1][1]]];
const translationComponent = [matrix[0][2], matrix[1][2]];
const invertedTranslation = [
-dot(rotationComponent[0], translationComponent),
-dot(rotationComponent[1], translationComponent),
];
return [
rotationComponent[0].concat(invertedTranslation[0]),
rotationComponent[1].concat(invertedTranslation[1]),
[0, 0, 1],
];
}
exports.invertTransformMatrix = invertTransformMatrix;
function rotatePoint(homogeneousCoordinate, rotationMatrix) {
return [
dot(homogeneousCoordinate, rotationMatrix[0]),
dot(homogeneousCoordinate, rotationMatrix[1]),
];
}
exports.rotatePoint = rotatePoint;
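
A quick sanity check of the rotation helpers (illustrative):

```js
// Rotating (1, 0) by 90 degrees around the origin should land at ~(0, 1).
const m = buildRotationMatrix(Math.PI / 2, [0, 0]);
console.log(rotatePoint([1, 0, 1], m)); // ~[0, 1], up to floating point error
```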

127
src/image.js Normal file

@ -0,0 +1,127 @@
const defaultFont = 'small-caps 1rem "Segoe UI"';
function clear(canvas) {
if (canvas) canvas.getContext('2d').clearRect(0, 0, canvas.width, canvas.height);
}
function crop(image, x, y, width, height, { color = 'white', title = null, font = null }) {
const canvas = new OffscreenCanvas(width, height);
const ctx = canvas.getContext('2d');
ctx.drawImage(image, x, y, width, height, 0, 0, canvas.width, canvas.height);
ctx.fillStyle = color;
ctx.font = font || defaultFont;
if (title) ctx.fillText(title, 2, 16, canvas.width - 4);
return canvas;
}
function point({ canvas = null, x = 0, y = 0, color = 'white', radius = 2, title = null, font = null }) {
if (!canvas) return;
const ctx = canvas.getContext('2d');
ctx.fillStyle = color;
ctx.beginPath();
ctx.arc(x, y, radius, 0, 2 * Math.PI);
ctx.fill();
ctx.font = font || defaultFont;
if (title) ctx.fillText(title, x + 10, y + 4);
}
function rect({ canvas = null, x = 0, y = 0, width = 0, height = 0, radius = 8, lineWidth = 2, color = 'white', title = null, font = null }) {
if (!canvas) return;
const ctx = canvas.getContext('2d');
ctx.lineWidth = lineWidth;
ctx.beginPath();
ctx.moveTo(x + radius, y);
ctx.lineTo(x + width - radius, y);
ctx.quadraticCurveTo(x + width, y, x + width, y + radius);
ctx.lineTo(x + width, y + height - radius);
ctx.quadraticCurveTo(x + width, y + height, x + width - radius, y + height);
ctx.lineTo(x + radius, y + height);
ctx.quadraticCurveTo(x, y + height, x, y + height - radius);
ctx.lineTo(x, y + radius);
ctx.quadraticCurveTo(x, y, x + radius, y);
ctx.closePath();
ctx.strokeStyle = color;
ctx.stroke();
ctx.lineWidth = 1;
ctx.fillStyle = color;
ctx.font = font || defaultFont;
if (title) ctx.fillText(title, x + 4, y + 16);
}
function line({ points = [], canvas = null, lineWidth = 2, color = 'white', title = null, font = null }) {
if (!canvas) return;
if (points.length < 2) return;
const ctx = canvas.getContext('2d');
ctx.lineWidth = lineWidth;
ctx.beginPath();
ctx.moveTo(points[0][0], points[0][1]);
for (const pt of points) ctx.lineTo(pt[0], pt[1]);
ctx.strokeStyle = color;
ctx.fillStyle = color;
ctx.stroke();
ctx.lineWidth = 1;
ctx.font = font || defaultFont;
if (title) ctx.fillText(title, points[0][0] + 4, points[0][1] + 16);
}
function spline({ points = [], canvas = null, tension = 0.5, lineWidth = 2, color = 'white', title = null, font = null }) {
if (!canvas) return;
if (points.length < 2) return;
const va = (arr, i, j) => [arr[2 * j] - arr[2 * i], arr[2 * j + 1] - arr[2 * i + 1]];
const distance = (arr, i, j) => Math.sqrt(((arr[2 * i] - arr[2 * j]) ** 2) + ((arr[2 * i + 1] - arr[2 * j + 1]) ** 2));
// eslint-disable-next-line no-unused-vars
function ctlpts(x1, y1, x2, y2, x3, y3) {
// eslint-disable-next-line prefer-rest-params
const v = va(arguments, 0, 2);
// eslint-disable-next-line prefer-rest-params
const d01 = distance(arguments, 0, 1);
// eslint-disable-next-line prefer-rest-params
const d12 = distance(arguments, 1, 2);
const d012 = d01 + d12;
return [
x2 - v[0] * tension * d01 / d012, y2 - v[1] * tension * d01 / d012,
x2 + v[0] * tension * d12 / d012, y2 + v[1] * tension * d12 / d012,
];
}
const pts = [];
for (const pt of points) {
pts.push(pt[0]);
pts.push(pt[1]);
}
let cps = [];
for (let i = 0; i < (pts.length / 2) - 2; i += 1) {
cps = cps.concat(ctlpts(pts[2 * i + 0], pts[2 * i + 1], pts[2 * i + 2], pts[2 * i + 3], pts[2 * i + 4], pts[2 * i + 5]));
}
const ctx = canvas.getContext('2d');
ctx.lineWidth = lineWidth;
ctx.strokeStyle = color;
ctx.fillStyle = color;
if (points.length === 2) {
ctx.beginPath();
ctx.moveTo(pts[0], pts[1]);
ctx.lineTo(pts[2], pts[3]);
} else {
ctx.beginPath();
ctx.moveTo(pts[0], pts[1]);
// first segment is a quadratic
ctx.quadraticCurveTo(cps[0], cps[1], pts[2], pts[3]);
// for all middle points, connect with bezier
let i;
for (i = 2; i < ((pts.length / 2) - 1); i += 1) {
ctx.bezierCurveTo(cps[(2 * (i - 1) - 1) * 2], cps[(2 * (i - 1) - 1) * 2 + 1], cps[(2 * (i - 1)) * 2], cps[(2 * (i - 1)) * 2 + 1], pts[i * 2], pts[i * 2 + 1]);
}
// last segment is a quadratic
ctx.quadraticCurveTo(cps[(2 * (i - 1) - 1) * 2], cps[(2 * (i - 1) - 1) * 2 + 1], pts[i * 2], pts[i * 2 + 1]);
}
ctx.stroke();
ctx.lineWidth = 1;
ctx.font = font || defaultFont;
if (title) ctx.fillText(title, points[0][0] + 4, points[0][1] + 16);
}
exports.crop = crop;
exports.rect = rect;
exports.point = point;
exports.line = line;
exports.spline = spline;
exports.clear = clear;
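
A hedged usage sketch of the drawing helpers; the canvas element id and all values are illustrative.

```js
// Assumes a canvas element with id 'overlay' exists on the page.
const canvas = document.getElementById('overlay');
rect({ canvas, x: 10, y: 10, width: 120, height: 80, color: 'lime', title: 'face' });
point({ canvas, x: 70, y: 50, radius: 3, color: 'red' });
spline({ canvas, points: [[10, 120], [60, 90], [110, 140], [160, 100]], tension: 0.5 });
```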

81
src/index.js Normal file

@ -0,0 +1,81 @@
const facemesh = require('./facemesh/index.js');
const ssrnet = require('./ssrnet/index.js');
const posenet = require('./posenet/index.js');
const handpose = require('./handpose/index.js');
// const image = require('./image.js');
// const triangulation = require('./triangulation.js').default;
const defaults = require('./config.js').default;
const models = {
  facemesh: null,
  blazeface: null,
  ssrnet: null,
  iris: null,
  posenet: null,
  handpose: null,
};
function mergeDeep(...objects) {
const isObject = (obj) => obj && typeof obj === 'object';
return objects.reduce((prev, obj) => {
Object.keys(obj).forEach((key) => {
const pVal = prev[key];
const oVal = obj[key];
if (Array.isArray(pVal) && Array.isArray(oVal)) {
prev[key] = pVal.concat(...oVal);
} else if (isObject(pVal) && isObject(oVal)) {
prev[key] = mergeDeep(pVal, oVal);
} else {
prev[key] = oVal;
}
});
return prev;
}, {});
}
async function detect(input, userConfig) {
const config = mergeDeep(defaults, userConfig);
// run posenet
let poseRes = [];
if (config.body.enabled) {
if (!models.posenet) models.posenet = await posenet.load(config.body);
poseRes = await models.posenet.estimateMultiplePoses(input, config.body);
}
// run handpose
let handRes = [];
if (config.hand.enabled) {
if (!models.handpose) models.handpose = await handpose.load(config.hand);
handRes = await models.handpose.estimateHands(input, config.hand);
}
// run facemesh, includes blazeface and iris
const faceRes = [];
if (config.face.enabled) {
if (!models.facemesh) models.facemesh = await facemesh.load(config.face);
const faces = await models.facemesh.estimateFaces(input, config.face);
for (const face of faces) {
// run ssr-net age & gender, inherits face from blazeface
const ssrdata = (config.face.age.enabled || config.face.gender.enabled) ? await ssrnet.predict(face.image, config) : {};
// iris: array[ bottom, left, top, right, center ]
const iris = (face.annotations.leftEyeIris && face.annotations.rightEyeIris)
? Math.max(face.annotations.leftEyeIris[3][0] - face.annotations.leftEyeIris[1][0], face.annotations.rightEyeIris[3][0] - face.annotations.rightEyeIris[1][0])
: 0;
faceRes.push({
confidence: face.confidence,
box: face.box,
mesh: face.mesh,
annotations: face.annotations,
age: ssrdata.age,
gender: ssrdata.gender,
iris: (iris !== 0) ? Math.trunc(100 * 11.7 / iris) / 100 : 0,
});
}
}
// combine results
return { face: faceRes, body: poseRes, hand: handRes };
}
exports.detect = detect;
exports.defaults = defaults;
exports.models = models;
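
A hedged usage sketch: user options are deep-merged over the defaults, so only the keys being changed need to be supplied. Key names are taken from the checks in `detect()` above; `videoElement` is a placeholder.

```js
const results = await detect(videoElement, {
  body: { enabled: true },
  hand: { enabled: false },
  face: { enabled: true, age: { enabled: false } },
});
console.log(results.face.length, results.body.length, results.hand.length);
```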

46
src/posenet/buildParts.js Normal file

@ -0,0 +1,46 @@
const heapSort = require('./heapSort');
function scoreIsMaximumInLocalWindow(keypointId, score, heatmapY, heatmapX, localMaximumRadius, scores) {
const [height, width] = scores.shape;
let localMaximum = true;
const yStart = Math.max(heatmapY - localMaximumRadius, 0);
const yEnd = Math.min(heatmapY + localMaximumRadius + 1, height);
for (let yCurrent = yStart; yCurrent < yEnd; ++yCurrent) {
const xStart = Math.max(heatmapX - localMaximumRadius, 0);
const xEnd = Math.min(heatmapX + localMaximumRadius + 1, width);
for (let xCurrent = xStart; xCurrent < xEnd; ++xCurrent) {
if (scores.get(yCurrent, xCurrent, keypointId) > score) {
localMaximum = false;
break;
}
}
if (!localMaximum) {
break;
}
}
return localMaximum;
}
/**
* Builds a priority queue with part candidate positions for a specific image in
* the batch. For this we find all local maxima in the score maps with score
* values above a threshold. We create a single priority queue across all parts.
*/
function buildPartWithScoreQueue(scoreThreshold, localMaximumRadius, scores) {
const [height, width, numKeypoints] = scores.shape;
const queue = new heapSort.MaxHeap(height * width * numKeypoints, ({ score }) => score);
for (let heatmapY = 0; heatmapY < height; ++heatmapY) {
for (let heatmapX = 0; heatmapX < width; ++heatmapX) {
for (let keypointId = 0; keypointId < numKeypoints; ++keypointId) {
const score = scores.get(heatmapY, heatmapX, keypointId);
// Only consider parts with score greater or equal to threshold as root candidates.
if (score < scoreThreshold) continue;
// Only consider keypoints whose score is maximum in a local window.
if (scoreIsMaximumInLocalWindow(keypointId, score, heatmapY, heatmapX, localMaximumRadius, scores)) {
queue.enqueue({ score, part: { heatmapY, heatmapX, id: keypointId } });
}
}
}
}
return queue;
}
exports.buildPartWithScoreQueue = buildPartWithScoreQueue;
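
A hedged sketch of the queue on a tiny heatmap. It assumes `tf` is in scope, since `scores` is a TensorBuffer (for example from `tensor.bufferSync()`).

```js
// 3x3 heatmap with a single part and one clear peak at (y=1, x=1).
const scores = tf.tensor3d([[[0], [0], [0]], [[0], [0.9], [0]], [[0], [0], [0]]]).bufferSync();
const queue = buildPartWithScoreQueue(0.5, 1, scores);
console.log(queue.size()); // 1 - only the peak passes the threshold and local-maximum test
```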

104
src/posenet/decodeMultiple.js Normal file

@ -0,0 +1,104 @@
const buildParts = require('./buildParts');
const decodePose = require('./decodePose');
const vectors = require('./vectors');
function withinNmsRadiusOfCorrespondingPoint(poses, squaredNmsRadius, { x, y }, keypointId) {
return poses.some(({ keypoints }) => {
const correspondingKeypoint = keypoints[keypointId].position;
return vectors.squaredDistance(y, x, correspondingKeypoint.y, correspondingKeypoint.x) <= squaredNmsRadius;
});
}
/* Score the newly proposed object instance without taking into account
* the scores of the parts that overlap with any previously detected
* instance.
*/
function getInstanceScore(existingPoses, squaredNmsRadius, instanceKeypoints) {
const notOverlappedKeypointScores = instanceKeypoints.reduce((result, { position, score }, keypointId) => {
if (!withinNmsRadiusOfCorrespondingPoint(existingPoses, squaredNmsRadius, position, keypointId)) {
result += score;
}
return result;
}, 0.0);
return notOverlappedKeypointScores / instanceKeypoints.length;
}
// A point (y, x) is considered a root part candidate if its score is a
// maximum in a window |y - y'| <= kLocalMaximumRadius, |x - x'| <=
// kLocalMaximumRadius.
const kLocalMaximumRadius = 1;
/**
* Detects multiple poses and finds their parts from part scores and
* displacement vectors. It returns up to `maxDetections` object instance
* detections in decreasing root score order. It works as follows: We first
* create a priority queue with local part score maxima above
* `scoreThreshold`, considering all parts at the same time. Then we
* iteratively pull the top element of the queue (in decreasing score order)
* and treat it as a root candidate for a new object instance. To avoid
* duplicate detections, we reject the root candidate if it is within a disk
* of `nmsRadius` pixels from the corresponding part of a previously detected
* instance, which is a form of part-based non-maximum suppression (NMS). If
* the root candidate passes the NMS check, we start a new object instance
* detection, treating the corresponding part as root and finding the
* positions of the remaining parts by following the displacement vectors
* along the tree-structured part graph. We assign to the newly detected
* instance a score equal to the sum of scores of its parts which have not
* been claimed by a previous instance (i.e., those at least `nmsRadius`
* pixels away from the corresponding part of all previously detected
* instances), divided by the total number of parts `numParts`.
*
* @param heatmapScores 3-D tensor with shape `[height, width, numParts]`.
* The value of heatmapScores[y, x, k]` is the score of placing the `k`-th
* object part at position `(y, x)`.
*
* @param offsets 3-D tensor with shape `[height, width, numParts * 2]`.
* The value of [offsets[y, x, k], offsets[y, x, k + numParts]]` is the
* short range offset vector of the `k`-th object part at heatmap
* position `(y, x)`.
*
* @param displacementsFwd 3-D tensor of shape
* `[height, width, 2 * num_edges]`, where `num_edges = num_parts - 1` is the
* number of edges (parent-child pairs) in the tree. It contains the forward
* displacements between consecutive parts from the root towards the leaves.
*
* @param displacementsBwd 3-D tensor of shape
* `[height, width, 2 * num_edges]`, where `num_edges = num_parts - 1` is the
* number of edges (parent-child pairs) in the tree. It contains the backward
* displacements between consecutive parts from the root towards the leaves.
*
* @param outputStride The output stride that was used when feed-forwarding
* through the PoseNet model. Must be 32, 16, or 8.
*
* @param maxPoseDetections Maximum number of returned instance detections per
* image.
*
* @param scoreThreshold Only return instance detections that have root part
* score greater or equal to this value. Defaults to 0.5.
*
* @param nmsRadius Non-maximum suppression part distance. It needs to be
* strictly positive. Two parts suppress each other if they are less than
* `nmsRadius` pixels away. Defaults to 20.
*
* @return An array of poses and their scores, each containing keypoints and
* the corresponding keypoint scores.
*/
function decodeMultiplePoses(scoresBuffer, offsetsBuffer, displacementsFwdBuffer, displacementsBwdBuffer, outputStride, maxPoseDetections, scoreThreshold = 0.5, nmsRadius = 20) {
const poses = [];
const queue = buildParts.buildPartWithScoreQueue(scoreThreshold, kLocalMaximumRadius, scoresBuffer);
const squaredNmsRadius = nmsRadius * nmsRadius;
// Generate at most maxDetections object instances per image in
// decreasing root part score order.
while (poses.length < maxPoseDetections && !queue.empty()) {
// The top element in the queue is the next root candidate.
const root = queue.dequeue();
// Part-based non-maximum suppression: We reject a root candidate if it
// is within a disk of `nmsRadius` pixels from the corresponding part of
// a previously detected instance.
const rootImageCoords = vectors.getImageCoords(root.part, outputStride, offsetsBuffer);
if (withinNmsRadiusOfCorrespondingPoint(poses, squaredNmsRadius, rootImageCoords, root.part.id)) continue;
// Start a new detection instance at the position of the root.
const keypoints = decodePose.decodePose(root, scoresBuffer, offsetsBuffer, outputStride, displacementsFwdBuffer, displacementsBwdBuffer);
const score = getInstanceScore(poses, squaredNmsRadius, keypoints);
poses.push({ keypoints, score });
}
return poses;
}
exports.decodeMultiplePoses = decodeMultiplePoses;
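
A hedged call sketch; the output tensor names here are assumptions based on the parameter docs above, not code from this commit.

```js
// Convert the four model output tensors to buffers, then decode up to 5 poses.
const [scoresBuf, offsetsBuf, dispFwdBuf, dispBwdBuf] = await Promise.all(
  [heatmapScores, offsets, displacementFwd, displacementBwd].map((t) => t.buffer()),
);
const poses = decodeMultiplePoses(scoresBuf, offsetsBuf, dispFwdBuf, dispBwdBuf, 16, 5, 0.5, 20);
```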

84
src/posenet/decodePose.js Normal file

@ -0,0 +1,84 @@
const keypoints = require('./keypoints');
const vectors = require('./vectors');
const parentChildrenTuples = keypoints.poseChain.map(([parentJointName, childJointName]) => ([keypoints.partIds[parentJointName], keypoints.partIds[childJointName]]));
const parentToChildEdges = parentChildrenTuples.map(([, childJointId]) => childJointId);
const childToParentEdges = parentChildrenTuples.map(([parentJointId]) => parentJointId);
function getDisplacement(edgeId, point, displacements) {
const numEdges = displacements.shape[2] / 2;
return {
y: displacements.get(point.y, point.x, edgeId),
x: displacements.get(point.y, point.x, numEdges + edgeId),
};
}
function getStridedIndexNearPoint(point, outputStride, height, width) {
return {
y: vectors.clamp(Math.round(point.y / outputStride), 0, height - 1),
x: vectors.clamp(Math.round(point.x / outputStride), 0, width - 1),
};
}
/**
* We get a new keypoint along the `edgeId` for the pose instance, assuming
* that the position of the `idSource` part is already known. For this, we
* follow the displacement vector from the source to target part (stored in
* the `i`-th channel of the displacement tensor). The displaced keypoint
* vector is refined using the offset vector by `offsetRefineStep` times.
*/
function traverseToTargetKeypoint(edgeId, sourceKeypoint, targetKeypointId, scoresBuffer, offsets, outputStride, displacements, offsetRefineStep = 2) {
const [height, width] = scoresBuffer.shape;
// Nearest neighbor interpolation for the source->target displacements.
const sourceKeypointIndices = getStridedIndexNearPoint(sourceKeypoint.position, outputStride, height, width);
const displacement = getDisplacement(edgeId, sourceKeypointIndices, displacements);
const displacedPoint = vectors.addVectors(sourceKeypoint.position, displacement);
let targetKeypoint = displacedPoint;
for (let i = 0; i < offsetRefineStep; i++) {
const targetKeypointIndices = getStridedIndexNearPoint(targetKeypoint, outputStride, height, width);
const offsetPoint = vectors.getOffsetPoint(targetKeypointIndices.y, targetKeypointIndices.x, targetKeypointId, offsets);
targetKeypoint = vectors.addVectors({
x: targetKeypointIndices.x * outputStride,
y: targetKeypointIndices.y * outputStride,
}, { x: offsetPoint.x, y: offsetPoint.y });
}
const targetKeyPointIndices = getStridedIndexNearPoint(targetKeypoint, outputStride, height, width);
const score = scoresBuffer.get(targetKeyPointIndices.y, targetKeyPointIndices.x, targetKeypointId);
return { position: targetKeypoint, part: keypoints.partNames[targetKeypointId], score };
}
/**
* Follows the displacement fields to decode the full pose of the object
* instance given the position of a part that acts as root.
*
* @return An array of decoded keypoints and their scores for a single pose
*/
function decodePose(root, scores, offsets, outputStride, displacementsFwd, displacementsBwd) {
const numParts = scores.shape[2];
const numEdges = parentToChildEdges.length;
const instanceKeypoints = new Array(numParts);
// Start a new detection instance at the position of the root.
const { part: rootPart, score: rootScore } = root;
const rootPoint = vectors.getImageCoords(rootPart, outputStride, offsets);
instanceKeypoints[rootPart.id] = {
score: rootScore,
part: keypoints.partNames[rootPart.id],
position: rootPoint,
};
// Decode the part positions upwards in the tree, following the backward
// displacements.
for (let edge = numEdges - 1; edge >= 0; --edge) {
const sourceKeypointId = parentToChildEdges[edge];
const targetKeypointId = childToParentEdges[edge];
if (instanceKeypoints[sourceKeypointId] && !instanceKeypoints[targetKeypointId]) {
instanceKeypoints[targetKeypointId] = traverseToTargetKeypoint(edge, instanceKeypoints[sourceKeypointId], targetKeypointId, scores, offsets, outputStride, displacementsBwd);
}
}
// Decode the part positions downwards in the tree, following the forward
// displacements.
for (let edge = 0; edge < numEdges; ++edge) {
const sourceKeypointId = childToParentEdges[edge];
const targetKeypointId = parentToChildEdges[edge];
if (instanceKeypoints[sourceKeypointId] && !instanceKeypoints[targetKeypointId]) {
instanceKeypoints[targetKeypointId] = traverseToTargetKeypoint(edge, instanceKeypoints[sourceKeypointId], targetKeypointId, scores, offsets, outputStride, displacementsFwd);
}
}
return instanceKeypoints;
}
exports.decodePose = decodePose;

59
src/posenet/decodeSingle.js Normal file

@ -0,0 +1,59 @@
const kpt = require('./keypoints');
const decoders = require('./decoders');
/**
* Detects a single pose and finds its parts from part scores and offset
* vectors. It returns a single pose detection. It works as follows:
* argmax2d is done on the scores to get the y and x index in the heatmap
* with the highest score for each part, which is essentially where the
* part is most likely to exist. This produces a tensor of size 17x2, with
* each row being the y and x index in the heatmap for each keypoint.
* The offset vector for each part is retrieved by getting the
* y and x from the offsets corresponding to the y and x index in the
* heatmap for that part. This produces a tensor of size 17x2, with each
* row being the offset vector for the corresponding keypoint.
* To get the keypoint, each part's heatmap y and x are multiplied
* by the output stride then added to their corresponding offset vector,
* which is in the same scale as the original image.
*
* @param heatmapScores 3-D tensor with shape `[height, width, numParts]`.
* The value of heatmapScores[y, x, k]` is the score of placing the `k`-th
* object part at position `(y, x)`.
*
* @param offsets 3-D tensor with shape `[height, width, numParts * 2]`.
* The value of [offsets[y, x, k], offsets[y, x, k + numParts]]` is the
* short range offset vector of the `k`-th object part at heatmap
* position `(y, x)`.
*
* @param outputStride The output stride that was used when feed-forwarding
* through the PoseNet model. Must be 32, 16, or 8.
*
* @return A promise that resolves with single pose with a confidence score,
* which contains an array of keypoints indexed by part id, each with a score
* and position.
*/
async function decodeSinglePose(heatmapScores, offsets, outputStride) {
let totalScore = 0.0;
const heatmapValues = decoders.argmax2d(heatmapScores);
const allTensorBuffers = await Promise.all([heatmapScores.buffer(), offsets.buffer(), heatmapValues.buffer()]);
const scoresBuffer = allTensorBuffers[0];
const offsetsBuffer = allTensorBuffers[1];
const heatmapValuesBuffer = allTensorBuffers[2];
const offsetPoints = decoders.getOffsetPoints(heatmapValuesBuffer, outputStride, offsetsBuffer);
const offsetPointsBuffer = await offsetPoints.buffer();
const keypointConfidence = Array.from(decoders.getPointsConfidence(scoresBuffer, heatmapValuesBuffer));
const keypoints = keypointConfidence.map((score, keypointId) => {
totalScore += score;
return {
position: {
y: offsetPointsBuffer.get(keypointId, 0),
x: offsetPointsBuffer.get(keypointId, 1),
},
part: kpt.partNames[keypointId],
score,
};
});
heatmapValues.dispose();
offsetPoints.dispose();
return { keypoints, score: totalScore / keypoints.length };
}
exports.decodeSinglePose = decodeSinglePose;
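
A hedged call sketch; `heatmapScores` and `offsets` stand in for the raw model output tensors.

```js
const pose = await decodeSinglePose(heatmapScores, offsets, 16);
console.log(pose.score, pose.keypoints.length); // average keypoint score, 17 keypoints
```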

60
src/posenet/decoders.js Normal file

@ -0,0 +1,60 @@
const tf = require('@tensorflow/tfjs');
const kpt = require('./keypoints');
function getPointsConfidence(heatmapScores, heatMapCoords) {
const numKeypoints = heatMapCoords.shape[0];
const result = new Float32Array(numKeypoints);
for (let keypoint = 0; keypoint < numKeypoints; keypoint++) {
const y = heatMapCoords.get(keypoint, 0);
const x = heatMapCoords.get(keypoint, 1);
result[keypoint] = heatmapScores.get(y, x, keypoint);
}
return result;
}
exports.getPointsConfidence = getPointsConfidence;
function getOffsetPoint(y, x, keypoint, offsetsBuffer) {
return {
y: offsetsBuffer.get(y, x, keypoint),
x: offsetsBuffer.get(y, x, keypoint + kpt.NUM_KEYPOINTS),
};
}
function getOffsetVectors(heatMapCoordsBuffer, offsetsBuffer) {
const result = [];
for (let keypoint = 0; keypoint < kpt.NUM_KEYPOINTS; keypoint++) {
const heatmapY = heatMapCoordsBuffer.get(keypoint, 0).valueOf();
const heatmapX = heatMapCoordsBuffer.get(keypoint, 1).valueOf();
const { x, y } = getOffsetPoint(heatmapY, heatmapX, keypoint, offsetsBuffer);
result.push(y);
result.push(x);
}
return tf.tensor2d(result, [kpt.NUM_KEYPOINTS, 2]);
}
exports.getOffsetVectors = getOffsetVectors;
function getOffsetPoints(heatMapCoordsBuffer, outputStride, offsetsBuffer) {
return tf.tidy(() => {
const offsetVectors = getOffsetVectors(heatMapCoordsBuffer, offsetsBuffer);
return heatMapCoordsBuffer.toTensor()
.mul(tf.scalar(outputStride, 'int32'))
.toFloat()
.add(offsetVectors);
});
}
exports.getOffsetPoints = getOffsetPoints;
function mod(a, b) {
return tf.tidy(() => {
const floored = a.div(tf.scalar(b, 'int32'));
return a.sub(floored.mul(tf.scalar(b, 'int32')));
});
}
function argmax2d(inputs) {
const [height, width, depth] = inputs.shape;
return tf.tidy(() => {
const reshaped = inputs.reshape([height * width, depth]);
const coords = reshaped.argMax(0);
const yCoords = coords.div(tf.scalar(width, 'int32')).expandDims(1);
const xCoords = mod(coords, width).expandDims(1);
return tf.concat([yCoords, xCoords], 1);
});
}
exports.argmax2d = argmax2d;
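
A tiny sanity check for `argmax2d` (illustrative values):

```js
// 2x2 heatmap, one part, peak at (y=1, x=0) -> expected coords [[1, 0]].
const heat = tf.tensor3d([[[0.1], [0.2]], [[0.9], [0.3]]]); // shape [2, 2, 1]
argmax2d(heat).print(); // [[1, 0]]
```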

72
src/posenet/heapSort.js Normal file

@ -0,0 +1,72 @@
// algorithm based on Coursera Lecture from Algorithms, Part 1: https://www.coursera.org/learn/algorithms-part1/lecture/ZjoSM/heapsort
function half(k) {
return Math.floor(k / 2);
}
class MaxHeap {
constructor(maxSize, getElementValue) {
this.priorityQueue = new Array(maxSize);
this.numberOfElements = -1;
this.getElementValue = getElementValue;
}
enqueue(x) {
this.priorityQueue[++this.numberOfElements] = x;
this.swim(this.numberOfElements);
}
dequeue() {
const max = this.priorityQueue[0];
this.exchange(0, this.numberOfElements--);
this.sink(0);
this.priorityQueue[this.numberOfElements + 1] = null;
return max;
}
empty() {
return this.numberOfElements === -1;
}
size() {
return this.numberOfElements + 1;
}
all() {
return this.priorityQueue.slice(0, this.numberOfElements + 1);
}
max() {
return this.priorityQueue[0];
}
swim(k) {
while (k > 0 && this.less(half(k), k)) {
this.exchange(k, half(k));
k = half(k);
}
}
sink(k) {
while (2 * k <= this.numberOfElements) {
let j = 2 * k;
if (j < this.numberOfElements && this.less(j, j + 1)) j++;
if (!this.less(k, j)) break;
this.exchange(k, j);
k = j;
}
}
getValueAt(i) {
return this.getElementValue(this.priorityQueue[i]);
}
less(i, j) {
return this.getValueAt(i) < this.getValueAt(j);
}
exchange(i, j) {
const t = this.priorityQueue[i];
this.priorityQueue[i] = this.priorityQueue[j];
this.priorityQueue[j] = t;
}
}
exports.MaxHeap = MaxHeap;
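
A quick illustration of the heap ordering (values are made up):

```js
const heap = new MaxHeap(10, ({ score }) => score);
heap.enqueue({ score: 0.2 });
heap.enqueue({ score: 0.9 });
heap.enqueue({ score: 0.5 });
console.log(heap.dequeue().score); // 0.9 - always the current maximum
console.log(heap.size());          // 2
```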

22
src/posenet/index.js Normal file

@ -0,0 +1,22 @@
const modelMobileNet = require('./modelMobileNet');
const modelPoseNet = require('./modelPoseNet');
const decodeMultiple = require('./decodeMultiple');
const decodeSingle = require('./decodeSingle');
const keypoints = require('./keypoints');
const util = require('./util');
exports.load = modelPoseNet.load;
exports.PoseNet = modelPoseNet.PoseNet;
exports.MobileNet = modelMobileNet.MobileNet;
exports.decodeMultiplePoses = decodeMultiple.decodeMultiplePoses;
exports.decodeSinglePose = decodeSingle.decodeSinglePose;
exports.partChannels = keypoints.partChannels;
exports.partIds = keypoints.partIds;
exports.partNames = keypoints.partNames;
exports.poseChain = keypoints.poseChain;
exports.getAdjacentKeyPoints = util.getAdjacentKeyPoints;
exports.getBoundingBox = util.getBoundingBox;
exports.getBoundingBoxPoints = util.getBoundingBoxPoints;
exports.scaleAndFlipPoses = util.scaleAndFlipPoses;
exports.scalePose = util.scalePose;
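
Taken together, these re-exports make the module usable end to end; a hedged sketch of multi-pose detection (the model path and tuning values are assumptions, not fixed by this file):

```js
const posenet = require('./src/posenet');

async function detectAll(videoElement) {
  // config fields consumed by load() and estimateMultiplePoses() in modelPoseNet.js
  const net = await posenet.load({ modelPath: '/models/posenet/model.json', outputStride: 16, inputResolution: 257 });
  const poses = await net.estimateMultiplePoses(videoElement, { maxDetections: 5, scoreThreshold: 0.5, nmsRadius: 20 });
  net.dispose();
  return poses;
}
```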

61
src/posenet/keypoints.js Normal file

@ -0,0 +1,61 @@
exports.partNames = [
'nose', 'leftEye', 'rightEye', 'leftEar', 'rightEar', 'leftShoulder',
'rightShoulder', 'leftElbow', 'rightElbow', 'leftWrist', 'rightWrist',
'leftHip', 'rightHip', 'leftKnee', 'rightKnee', 'leftAnkle', 'rightAnkle',
];
exports.NUM_KEYPOINTS = exports.partNames.length;
exports.partIds = exports.partNames.reduce((result, jointName, i) => {
result[jointName] = i;
return result;
}, {});
const connectedPartNames = [
['leftHip', 'leftShoulder'], ['leftElbow', 'leftShoulder'],
['leftElbow', 'leftWrist'], ['leftHip', 'leftKnee'],
['leftKnee', 'leftAnkle'], ['rightHip', 'rightShoulder'],
['rightElbow', 'rightShoulder'], ['rightElbow', 'rightWrist'],
['rightHip', 'rightKnee'], ['rightKnee', 'rightAnkle'],
['leftShoulder', 'rightShoulder'], ['leftHip', 'rightHip'],
];
/*
* Define the skeleton. This defines the parent->child relationships of our
* tree. Arbitrarily this defines the nose as the root of the tree, however
* since we will infer the displacement for both parent->child and
* child->parent, we can define the tree root as any node.
*/
exports.poseChain = [
['nose', 'leftEye'], ['leftEye', 'leftEar'], ['nose', 'rightEye'],
['rightEye', 'rightEar'], ['nose', 'leftShoulder'],
['leftShoulder', 'leftElbow'], ['leftElbow', 'leftWrist'],
['leftShoulder', 'leftHip'], ['leftHip', 'leftKnee'],
['leftKnee', 'leftAnkle'], ['nose', 'rightShoulder'],
['rightShoulder', 'rightElbow'], ['rightElbow', 'rightWrist'],
['rightShoulder', 'rightHip'], ['rightHip', 'rightKnee'],
['rightKnee', 'rightAnkle'],
];
exports.connectedPartIndices = connectedPartNames.map(([jointNameA, jointNameB]) => ([exports.partIds[jointNameA], exports.partIds[jointNameB]]));
exports.partChannels = [
'left_face',
'right_face',
'right_upper_leg_front',
'right_lower_leg_back',
'right_upper_leg_back',
'left_lower_leg_front',
'left_upper_leg_front',
'left_upper_leg_back',
'left_lower_leg_back',
'right_feet',
'right_lower_leg_front',
'left_feet',
'torso_front',
'torso_back',
'right_upper_arm_front',
'right_upper_arm_back',
'right_lower_arm_back',
'left_lower_arm_front',
'left_upper_arm_front',
'left_upper_arm_back',
'left_lower_arm_back',
'right_hand',
'right_lower_arm_front',
'left_hand',
];
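
The decoder resolves `poseChain` names into numeric id pairs through `partIds`; a one-liner sketch of that lookup:

```js
const kpt = require('./keypoints');

// ['nose', 'leftEye'] -> [0, 1], and so on for all 16 edges
const parentChildTuples = kpt.poseChain.map(([parent, child]) => [kpt.partIds[parent], kpt.partIds[child]]);
console.log(parentChildTuples[0]); // [0, 1]
```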

55
src/posenet/modelBase.js Normal file

@ -0,0 +1,55 @@
const tf = require('@tensorflow/tfjs');
/**
* PoseNet supports using various convolution neural network models
* (e.g. ResNet and MobileNetV1) as its underlying base model.
* The following BaseModel interface defines a unified interface for
* creating such PoseNet base models. Currently MobileNet (in
* ./modelMobileNet.js) implements the BaseModel
* interface. New base models that conform to the BaseModel interface can be
* added to PoseNet.
*/
class BaseModel {
constructor(model, outputStride) {
this.model = model;
this.outputStride = outputStride;
const inputShape = this.model.inputs[0].shape;
tf.util.assert((inputShape[1] === -1) && (inputShape[2] === -1), () => `Input shape [${inputShape[1]}, ${inputShape[2]}] must both be -1 (dynamic)`);
}
/**
* Predicts intermediate Tensor representations.
*
* @param input The input RGB image of the base model.
* A Tensor of shape: [`inputResolution`, `inputResolution`, 3].
*
* @return A dictionary of base model's intermediate predictions.
* The returned dictionary should contain the following elements:
* heatmapScores: A Tensor3D that represents the heatmapScores.
* offsets: A Tensor3D that represents the offsets.
* displacementFwd: A Tensor3D that represents the forward displacement.
* displacementBwd: A Tensor3D that represents the backward displacement.
*/
predict(input) {
return tf.tidy(() => {
const asFloat = this.preprocessInput(input.toFloat());
const asBatch = asFloat.expandDims(0);
const results = this.model.predict(asBatch);
const results3d = results.map((y) => y.squeeze([0]));
const namedResults = this.nameOutputResults(results3d);
return {
heatmapScores: namedResults.heatmap.sigmoid(),
offsets: namedResults.offsets,
displacementFwd: namedResults.displacementFwd,
displacementBwd: namedResults.displacementBwd,
};
});
}
/**
* Releases the CPU and GPU memory allocated by the model.
*/
dispose() {
this.model.dispose();
}
}
exports.BaseModel = BaseModel;
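
A subclass only has to supply `preprocessInput` and `nameOutputResults`. As a hypothetical example (not part of this commit), a ResNet-style base model mirroring the upstream tfjs-models variant would differ only in its normalization and output ordering:

```js
const tf = require('@tensorflow/tfjs');
const modelBase = require('./modelBase');

class ResNet extends modelBase.BaseModel {
  // ResNet preprocessing adds negative ImageNet channel means instead of scaling to [-1, 1]
  preprocessInput(input) {
    return tf.tidy(() => input.add(tf.tensor1d([-123.15, -115.9, -103.06])));
  }
  // the ResNet graph emits its four output tensors in a different order than MobileNet
  nameOutputResults(results) {
    const [displacementFwd, displacementBwd, offsets, heatmap] = results;
    return { offsets, heatmap, displacementFwd, displacementBwd };
  }
}
```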

17
src/posenet/modelMobileNet.js Normal file

@ -0,0 +1,17 @@
const tf = require('@tensorflow/tfjs');
const modelBase = require('./modelBase');
class MobileNet extends modelBase.BaseModel {
// eslint-disable-next-line class-methods-use-this
preprocessInput(input) {
// Normalize the pixels [0, 255] to be between [-1, 1].
return tf.tidy(() => tf.div(input, 127.5).sub(1.0));
}
// eslint-disable-next-line class-methods-use-this
nameOutputResults(results) {
const [offsets, heatmap, displacementFwd, displacementBwd] = results;
return { offsets, heatmap, displacementFwd, displacementBwd };
}
}
exports.MobileNet = MobileNet;
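
The `div(127.5).sub(1)` normalization maps pixel 0 to -1, 127.5 to 0, and 255 to 1; a quick check of that mapping:

```js
const tf = require('@tensorflow/tfjs');

const pixels = tf.tensor1d([0, 127.5, 255]);
tf.div(pixels, 127.5).sub(1.0).print(); // [-1, 0, 1]
```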

113
src/posenet/modelPoseNet.js Normal file

@ -0,0 +1,113 @@
const tf = require('@tensorflow/tfjs');
const modelMobileNet = require('./modelMobileNet');
const decodeMultiple = require('./decodeMultiple');
const decodeSingle = require('./decodeSingle');
const util = require('./util');
class PoseNet {
constructor(net, inputResolution) {
this.baseModel = net;
this.inputResolution = inputResolution;
}
/**
* Infers through PoseNet and estimates multiple poses using the outputs.
* This does standard ImageNet pre-processing before inferring through the
* model. The image pixels should have values in [0, 255]. It detects
* multiple poses and finds their parts from part scores and displacement
* vectors using a fast greedy decoding algorithm. It returns up to
* `config.maxDetections` object instance detections in decreasing root
* score order.
*
* @param input
* (ImageData|HTMLImageElement|HTMLCanvasElement|HTMLVideoElement) The input
* image to feed through the network.
*
* @param config MultiPoseEstimationConfig object that contains parameters
* for the PoseNet inference using multiple pose estimation.
*
* @return An array of poses and their scores, each containing keypoints and
* the corresponding keypoint scores. The positions of the keypoints are
* in the same scale as the original image
*/
async estimateMultiplePoses(input, config) {
const outputStride = this.baseModel.outputStride;
const inputResolution = this.inputResolution;
const [height, width] = util.getInputTensorDimensions(input);
const { resized, padding } = util.padAndResizeTo(input, [inputResolution, inputResolution]);
const { heatmapScores, offsets, displacementFwd, displacementBwd } = this.baseModel.predict(resized);
const allTensorBuffers = await util.toTensorBuffers3D([heatmapScores, offsets, displacementFwd, displacementBwd]);
const scoresBuffer = allTensorBuffers[0];
const offsetsBuffer = allTensorBuffers[1];
const displacementsFwdBuffer = allTensorBuffers[2];
const displacementsBwdBuffer = allTensorBuffers[3];
const poses = await decodeMultiple.decodeMultiplePoses(scoresBuffer, offsetsBuffer, displacementsFwdBuffer, displacementsBwdBuffer, outputStride, config.maxDetections, config.scoreThreshold, config.nmsRadius);
const resultPoses = util.scaleAndFlipPoses(poses, [height, width], [inputResolution, inputResolution], padding);
heatmapScores.dispose();
offsets.dispose();
displacementFwd.dispose();
displacementBwd.dispose();
resized.dispose();
return resultPoses;
}
/**
* Infers through PoseNet and estimates a single pose using the outputs.
* This does standard ImageNet pre-processing before inferring through the
* model. The image pixels should have values in [0, 255]. It detects a
* single pose and finds its parts from part scores and offset vectors,
* returning that single pose.
*
* @param input
* (ImageData|HTMLImageElement|HTMLCanvasElement|HTMLVideoElement) The input
* image to feed through the network.
*
* @return A pose and its score, containing keypoints and
* the corresponding keypoint scores. The positions of the keypoints are
* in the same scale as the original image
*/
async estimateSinglePose(input) {
const outputStride = this.baseModel.outputStride;
const inputResolution = this.inputResolution;
const [height, width] = util.getInputTensorDimensions(input);
const { resized, padding } = util.padAndResizeTo(input, [inputResolution, inputResolution]);
const { heatmapScores, offsets, displacementFwd, displacementBwd } = this.baseModel.predict(resized);
const pose = await decodeSingle.decodeSinglePose(heatmapScores, offsets, outputStride);
const poses = [pose];
const resultPoses = util.scaleAndFlipPoses(poses, [height, width], [inputResolution, inputResolution], padding);
heatmapScores.dispose();
offsets.dispose();
displacementFwd.dispose();
displacementBwd.dispose();
resized.dispose();
return resultPoses[0];
}
dispose() {
this.baseModel.dispose();
}
}
exports.PoseNet = PoseNet;
async function loadMobileNet(config) {
const outputStride = config.outputStride;
const graphModel = await tf.loadGraphModel(config.modelPath);
const mobilenet = new modelMobileNet.MobileNet(graphModel, outputStride);
return new PoseNet(mobilenet, config.inputResolution);
}
/**
* Loads the PoseNet model instance from a checkpoint, with the MobileNet architecture. The model to be loaded is configurable using the
* config dictionary ModelConfig. Please find more details in the documentation of the ModelConfig.
*
* @param config ModelConfig dictionary that contains parameters for
* the PoseNet loading process. Please find more details of each parameter
* in the documentation of the ModelConfig interface. The predefined
* `MOBILENET_V1_CONFIG` and `RESNET_CONFIG` can also be used as references
* for defining your customized config.
*/
async function load(config) {
return loadMobileNet(config);
}
exports.load = load;
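
The single-pose path returns one `{ score, keypoints }` object rather than an array; a hedged usage sketch (model path and thresholds are assumptions):

```js
const posenet = require('./src/posenet');

async function detectOne(image) {
  const net = await posenet.load({ modelPath: '/models/posenet/model.json', outputStride: 16, inputResolution: 257 });
  const pose = await net.estimateSinglePose(image);
  // pose.keypoints: 17 entries of { score, part, position: { x, y } }, in original image scale
  return pose.keypoints.filter((kp) => kp.score > 0.5);
}
```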

28
src/posenet/modelWeights.js Normal file

@ -0,0 +1,28 @@
class ModelWeights {
constructor(variables) {
this.variables = variables;
}
weights(layerName) {
return this.variables[`MobilenetV1/${layerName}/weights`];
}
depthwiseBias(layerName) {
return this.variables[`MobilenetV1/${layerName}/biases`];
}
convBias(layerName) {
return this.depthwiseBias(layerName);
}
depthwiseWeights(layerName) {
return this.variables[`MobilenetV1/${layerName}/depthwise_weights`];
}
dispose() {
for (const varName in this.variables) {
this.variables[varName].dispose();
}
}
}
exports.ModelWeights = ModelWeights;
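
`ModelWeights` is just a thin lookup over checkpoint variable names; a sketch with stub tensors showing the `MobilenetV1/<layer>/...` naming it assumes (the require path is an assumption matching the reconstructed filename above):

```js
const tf = require('@tensorflow/tfjs');
const { ModelWeights } = require('./modelWeights');

const weights = new ModelWeights({
  'MobilenetV1/Conv2d_0/weights': tf.zeros([3, 3, 3, 8]),
  'MobilenetV1/Conv2d_0/biases': tf.zeros([8]),
});
console.log(weights.weights('Conv2d_0').shape);  // [3, 3, 3, 8]
console.log(weights.convBias('Conv2d_0').shape); // [8]
weights.dispose();
```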

118
src/posenet/util.js Normal file

@ -0,0 +1,118 @@
const tf = require('@tensorflow/tfjs');
const kpt = require('./keypoints');
function eitherPointDoesntMeetConfidence(a, b, minConfidence) {
return (a < minConfidence || b < minConfidence);
}
function getAdjacentKeyPoints(keypoints, minConfidence) {
return kpt.connectedPartIndices.reduce((result, [leftJoint, rightJoint]) => {
if (eitherPointDoesntMeetConfidence(keypoints[leftJoint].score, keypoints[rightJoint].score, minConfidence)) {
return result;
}
result.push([keypoints[leftJoint], keypoints[rightJoint]]);
return result;
}, []);
}
exports.getAdjacentKeyPoints = getAdjacentKeyPoints;
const { NEGATIVE_INFINITY, POSITIVE_INFINITY } = Number;
function getBoundingBox(keypoints) {
return keypoints.reduce(({ maxX, maxY, minX, minY }, { position: { x, y } }) => ({
maxX: Math.max(maxX, x),
maxY: Math.max(maxY, y),
minX: Math.min(minX, x),
minY: Math.min(minY, y),
}), {
maxX: NEGATIVE_INFINITY,
maxY: NEGATIVE_INFINITY,
minX: POSITIVE_INFINITY,
minY: POSITIVE_INFINITY,
});
}
exports.getBoundingBox = getBoundingBox;
function getBoundingBoxPoints(keypoints) {
const { minX, minY, maxX, maxY } = getBoundingBox(keypoints);
return [{ x: minX, y: minY }, { x: maxX, y: minY }, { x: maxX, y: maxY }, { x: minX, y: maxY }];
}
exports.getBoundingBoxPoints = getBoundingBoxPoints;
async function toTensorBuffers3D(tensors) {
return Promise.all(tensors.map((tensor) => tensor.buffer()));
}
exports.toTensorBuffers3D = toTensorBuffers3D;
function scalePose(pose, scaleY, scaleX, offsetY = 0, offsetX = 0) {
return {
score: pose.score,
keypoints: pose.keypoints.map(({ score, part, position }) => ({
score,
part,
position: {
x: position.x * scaleX + offsetX,
y: position.y * scaleY + offsetY,
},
})),
};
}
exports.scalePose = scalePose;
function scalePoses(poses, scaleY, scaleX, offsetY = 0, offsetX = 0) {
if (scaleX === 1 && scaleY === 1 && offsetY === 0 && offsetX === 0) {
return poses;
}
return poses.map((pose) => scalePose(pose, scaleY, scaleX, offsetY, offsetX));
}
exports.scalePoses = scalePoses;
function getInputTensorDimensions(input) {
return input instanceof tf.Tensor ? [input.shape[0], input.shape[1]] : [input.height, input.width];
}
exports.getInputTensorDimensions = getInputTensorDimensions;
function toInputTensor(input) {
return input instanceof tf.Tensor ? input : tf.browser.fromPixels(input);
}
exports.toInputTensor = toInputTensor;
function toResizedInputTensor(input, resizeHeight, resizeWidth) {
return tf.tidy(() => {
const imageTensor = toInputTensor(input);
return imageTensor.resizeBilinear([resizeHeight, resizeWidth]);
});
}
exports.toResizedInputTensor = toResizedInputTensor;
function padAndResizeTo(input, [targetH, targetW]) {
const [height, width] = getInputTensorDimensions(input);
const targetAspect = targetW / targetH;
const aspect = width / height;
let [padT, padB, padL, padR] = [0, 0, 0, 0];
if (aspect < targetAspect) {
// pads the width
padT = 0;
padB = 0;
padL = Math.round(0.5 * (targetAspect * height - width));
padR = Math.round(0.5 * (targetAspect * height - width));
} else {
// pads the height
padT = Math.round(0.5 * ((1.0 / targetAspect) * width - height));
padB = Math.round(0.5 * ((1.0 / targetAspect) * width - height));
padL = 0;
padR = 0;
}
const resized = tf.tidy(() => {
let imageTensor = toInputTensor(input);
imageTensor = tf.pad3d(imageTensor, [[padT, padB], [padL, padR], [0, 0]]);
return imageTensor.resizeBilinear([targetH, targetW]);
});
return { resized, padding: { top: padT, left: padL, right: padR, bottom: padB } };
}
exports.padAndResizeTo = padAndResizeTo;
function scaleAndFlipPoses(poses, [height, width], [inputResolutionHeight, inputResolutionWidth], padding) {
const scaleY = (height + padding.top + padding.bottom) / (inputResolutionHeight);
const scaleX = (width + padding.left + padding.right) / (inputResolutionWidth);
const scaledPoses = scalePoses(poses, scaleY, scaleX, -padding.top, -padding.left);
return scaledPoses;
}
exports.scaleAndFlipPoses = scaleAndFlipPoses;
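
As a worked example of the padding math in `padAndResizeTo`: a 1280x720 frame targeting a 513x513 square has aspect 1.78 > 1.0, so only the height is padded, by round(0.5 * (1280 - 720)) = 280 rows top and bottom, before the bilinear resize; `scaleAndFlipPoses` later reverses exactly this padding and scaling. A standalone restatement of the padding branch:

```js
// pure-math version of the padding computation in padAndResizeTo()
function computePadding([height, width], [targetH, targetW]) {
  const targetAspect = targetW / targetH;
  const aspect = width / height;
  if (aspect < targetAspect) {
    const pad = Math.round(0.5 * (targetAspect * height - width));
    return { top: 0, bottom: 0, left: pad, right: pad }; // pad the width
  }
  const pad = Math.round(0.5 * ((1.0 / targetAspect) * width - height));
  return { top: pad, bottom: pad, left: 0, right: 0 }; // pad the height
}

console.log(computePadding([720, 1280], [513, 513])); // { top: 280, bottom: 280, left: 0, right: 0 }
```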

52
src/posenet/vectors.js Normal file

@ -0,0 +1,52 @@
const kpt = require('./keypoints');
function getOffsetPoint(y, x, keypoint, offsets) {
return {
y: offsets.get(y, x, keypoint),
x: offsets.get(y, x, keypoint + kpt.NUM_KEYPOINTS),
};
}
exports.getOffsetPoint = getOffsetPoint;
function getImageCoords(part, outputStride, offsets) {
const { heatmapY, heatmapX, id: keypoint } = part;
const { y, x } = getOffsetPoint(heatmapY, heatmapX, keypoint, offsets);
return {
x: part.heatmapX * outputStride + x,
y: part.heatmapY * outputStride + y,
};
}
exports.getImageCoords = getImageCoords;
function fillArray(element, size) {
const result = new Array(size);
for (let i = 0; i < size; i++) {
result[i] = element;
}
return result;
}
exports.fillArray = fillArray;
function clamp(a, min, max) {
if (a < min) return min;
if (a > max) return max;
return a;
}
exports.clamp = clamp;
function squaredDistance(y1, x1, y2, x2) {
const dy = y2 - y1;
const dx = x2 - x1;
return dy * dy + dx * dx;
}
exports.squaredDistance = squaredDistance;
function addVectors(a, b) {
return { x: a.x + b.x, y: a.y + b.y };
}
exports.addVectors = addVectors;
function clampVector(a, min, max) {
return { y: clamp(a.y, min, max), x: clamp(a.x, min, max) };
}
exports.clampVector = clampVector;
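
For example, with `outputStride` 16, a part at heatmap cell (y=12, x=20) whose offset vector is (3.5, -2.1) lands at image coordinates y = 12*16 + 3.5 = 195.5 and x = 20*16 - 2.1 = 317.9. A sketch using a stub with the same `.get(y, x, channel)` interface as a TensorBuffer:

```js
const vectors = require('./vectors');

// channels 0-16 hold y-offsets, channels 17-33 hold x-offsets (NUM_KEYPOINTS = 17)
const offsets = { get: (y, x, channel) => (channel < 17 ? 3.5 : -2.1) };
const part = { heatmapY: 12, heatmapX: 20, id: 0 };
console.log(vectors.getImageCoords(part, 16, offsets)); // { x: 317.9, y: 195.5 }
```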

50
src/ssrnet/index.js Normal file

@ -0,0 +1,50 @@
const tf = require('@tensorflow/tfjs');
const models = {};
let last = { age: 0, gender: '' };
let frame = 0;
async function getImage(image, size) {
const tensor = tf.tidy(() => {
const buffer = tf.browser.fromPixels(image);
const resize = tf.image.resizeBilinear(buffer, [size, size]);
const expand = tf.cast(tf.expandDims(resize, 0), 'float32');
// const normalize = tf.mul(expand, [1.0 / 1.0]);
return expand;
});
return tensor;
}
async function predict(image, config) {
  // reuse the cached result for skipFrames frames before re-running the models
  if (frame < config.face.age.skipFrames && last.age > 0) {
    frame += 1;
    return last;
  }
  frame = 0;
if (!models.age && config.face.age.enabled) models.age = await tf.loadGraphModel(config.face.age.modelPath);
if (!models.gender && config.face.gender.enabled) models.gender = await tf.loadGraphModel(config.face.gender.modelPath);
let enhance;
if (image instanceof tf.Tensor) {
const resize = tf.image.resizeBilinear(image, [config.face.age.inputSize, config.face.age.inputSize], false);
enhance = tf.mul(resize, [255.0]);
tf.dispose(resize);
} else {
enhance = await getImage(image, config.face.age.inputSize);
}
const obj = {};
if (config.face.age.enabled) {
const ageT = await models.age.predict(enhance);
obj.age = Math.trunc(10 * ageT.dataSync()[0]) / 10;
tf.dispose(ageT);
}
if (config.face.gender.enabled) {
const genderT = await models.gender.predict(enhance);
obj.gender = Math.trunc(100 * genderT.dataSync()[0]) < 50 ? 'female' : 'male';
tf.dispose(genderT);
}
tf.dispose(enhance);
last = obj;
return obj;
}
exports.predict = predict;
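
The config shape `predict` expects, inferred from the property accesses above (model paths and values are assumptions):

```js
const ssrnet = require('./src/ssrnet');

const config = {
  face: {
    age: { enabled: true, modelPath: '/models/ssrnet-age/model.json', inputSize: 64, skipFrames: 10 },
    gender: { enabled: true, modelPath: '/models/ssrnet-gender/model.json' },
  },
};

async function demo(image) {
  // image: HTMLImage/HTMLVideo/Canvas element, or a [1, h, w, 3] tensor
  const { age, gender } = await ssrnet.predict(image, config);
  console.log(age, gender);
}
```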

169
src/triangulation.js Normal file

@ -0,0 +1,169 @@
export default [
127, 34, 139, 11, 0, 37, 232, 231, 120, 72, 37, 39, 128, 121, 47, 232, 121,
128, 104, 69, 67, 175, 171, 148, 157, 154, 155, 118, 50, 101, 73, 39, 40, 9,
151, 108, 48, 115, 131, 194, 204, 211, 74, 40, 185, 80, 42, 183, 40, 92,
186, 230, 229, 118, 202, 212, 214, 83, 18, 17, 76, 61, 146, 160, 29, 30, 56,
157, 173, 106, 204, 194, 135, 214, 192, 203, 165, 98, 21, 71, 68, 51, 45, 4,
144, 24, 23, 77, 146, 91, 205, 50, 187, 201, 200, 18, 91, 106, 182, 90, 91,
181, 85, 84, 17, 206, 203, 36, 148, 171, 140, 92, 40, 39, 193, 189, 244,
159, 158, 28, 247, 246, 161, 236, 3, 196, 54, 68, 104, 193, 168, 8, 117,
228, 31, 189, 193, 55, 98, 97, 99, 126, 47, 100, 166, 79, 218, 155, 154, 26,
209, 49, 131, 135, 136, 150, 47, 126, 217, 223, 52, 53, 45, 51, 134, 211,
170, 140, 67, 69, 108, 43, 106, 91, 230, 119, 120, 226, 130, 247, 63, 53,
52, 238, 20, 242, 46, 70, 156, 78, 62, 96, 46, 53, 63, 143, 34, 227, 173,
155, 133, 123, 117, 111, 44, 125, 19, 236, 134, 51, 216, 206, 205, 154, 153,
22, 39, 37, 167, 200, 201, 208, 36, 142, 100, 57, 212, 202, 20, 60, 99, 28,
158, 157, 35, 226, 113, 160, 159, 27, 204, 202, 210, 113, 225, 46, 43, 202,
204, 62, 76, 77, 137, 123, 116, 41, 38, 72, 203, 129, 142, 64, 98, 240, 49,
102, 64, 41, 73, 74, 212, 216, 207, 42, 74, 184, 169, 170, 211, 170, 149,
176, 105, 66, 69, 122, 6, 168, 123, 147, 187, 96, 77, 90, 65, 55, 107, 89,
90, 180, 101, 100, 120, 63, 105, 104, 93, 137, 227, 15, 86, 85, 129, 102,
49, 14, 87, 86, 55, 8, 9, 100, 47, 121, 145, 23, 22, 88, 89, 179, 6, 122,
196, 88, 95, 96, 138, 172, 136, 215, 58, 172, 115, 48, 219, 42, 80, 81, 195,
3, 51, 43, 146, 61, 171, 175, 199, 81, 82, 38, 53, 46, 225, 144, 163, 110,
246, 33, 7, 52, 65, 66, 229, 228, 117, 34, 127, 234, 107, 108, 69, 109, 108,
151, 48, 64, 235, 62, 78, 191, 129, 209, 126, 111, 35, 143, 163, 161, 246,
117, 123, 50, 222, 65, 52, 19, 125, 141, 221, 55, 65, 3, 195, 197, 25, 7,
33, 220, 237, 44, 70, 71, 139, 122, 193, 245, 247, 130, 33, 71, 21, 162,
153, 158, 159, 170, 169, 150, 188, 174, 196, 216, 186, 92, 144, 160, 161, 2,
97, 167, 141, 125, 241, 164, 167, 37, 72, 38, 12, 145, 159, 160, 38, 82, 13,
63, 68, 71, 226, 35, 111, 158, 153, 154, 101, 50, 205, 206, 92, 165, 209,
198, 217, 165, 167, 97, 220, 115, 218, 133, 112, 243, 239, 238, 241, 214,
135, 169, 190, 173, 133, 171, 208, 32, 125, 44, 237, 86, 87, 178, 85, 86,
179, 84, 85, 180, 83, 84, 181, 201, 83, 182, 137, 93, 132, 76, 62, 183, 61,
76, 184, 57, 61, 185, 212, 57, 186, 214, 207, 187, 34, 143, 156, 79, 239,
237, 123, 137, 177, 44, 1, 4, 201, 194, 32, 64, 102, 129, 213, 215, 138, 59,
166, 219, 242, 99, 97, 2, 94, 141, 75, 59, 235, 24, 110, 228, 25, 130, 226,
23, 24, 229, 22, 23, 230, 26, 22, 231, 112, 26, 232, 189, 190, 243, 221, 56,
190, 28, 56, 221, 27, 28, 222, 29, 27, 223, 30, 29, 224, 247, 30, 225, 238,
79, 20, 166, 59, 75, 60, 75, 240, 147, 177, 215, 20, 79, 166, 187, 147, 213,
112, 233, 244, 233, 128, 245, 128, 114, 188, 114, 217, 174, 131, 115, 220,
217, 198, 236, 198, 131, 134, 177, 132, 58, 143, 35, 124, 110, 163, 7, 228,
110, 25, 356, 389, 368, 11, 302, 267, 452, 350, 349, 302, 303, 269, 357,
343, 277, 452, 453, 357, 333, 332, 297, 175, 152, 377, 384, 398, 382, 347,
348, 330, 303, 304, 270, 9, 336, 337, 278, 279, 360, 418, 262, 431, 304,
408, 409, 310, 415, 407, 270, 409, 410, 450, 348, 347, 422, 430, 434, 313,
314, 17, 306, 307, 375, 387, 388, 260, 286, 414, 398, 335, 406, 418, 364,
367, 416, 423, 358, 327, 251, 284, 298, 281, 5, 4, 373, 374, 253, 307, 320,
321, 425, 427, 411, 421, 313, 18, 321, 405, 406, 320, 404, 405, 315, 16, 17,
426, 425, 266, 377, 400, 369, 322, 391, 269, 417, 465, 464, 386, 257, 258,
466, 260, 388, 456, 399, 419, 284, 332, 333, 417, 285, 8, 346, 340, 261,
413, 441, 285, 327, 460, 328, 355, 371, 329, 392, 439, 438, 382, 341, 256,
429, 420, 360, 364, 394, 379, 277, 343, 437, 443, 444, 283, 275, 440, 363,
431, 262, 369, 297, 338, 337, 273, 375, 321, 450, 451, 349, 446, 342, 467,
293, 334, 282, 458, 461, 462, 276, 353, 383, 308, 324, 325, 276, 300, 293,
372, 345, 447, 382, 398, 362, 352, 345, 340, 274, 1, 19, 456, 248, 281, 436,
427, 425, 381, 256, 252, 269, 391, 393, 200, 199, 428, 266, 330, 329, 287,
273, 422, 250, 462, 328, 258, 286, 384, 265, 353, 342, 387, 259, 257, 424,
431, 430, 342, 353, 276, 273, 335, 424, 292, 325, 307, 366, 447, 345, 271,
303, 302, 423, 266, 371, 294, 455, 460, 279, 278, 294, 271, 272, 304, 432,
434, 427, 272, 407, 408, 394, 430, 431, 395, 369, 400, 334, 333, 299, 351,
417, 168, 352, 280, 411, 325, 319, 320, 295, 296, 336, 319, 403, 404, 330,
348, 349, 293, 298, 333, 323, 454, 447, 15, 16, 315, 358, 429, 279, 14, 15,
316, 285, 336, 9, 329, 349, 350, 374, 380, 252, 318, 402, 403, 6, 197, 419,
318, 319, 325, 367, 364, 365, 435, 367, 397, 344, 438, 439, 272, 271, 311,
195, 5, 281, 273, 287, 291, 396, 428, 199, 311, 271, 268, 283, 444, 445,
373, 254, 339, 263, 466, 249, 282, 334, 296, 449, 347, 346, 264, 447, 454,
336, 296, 299, 338, 10, 151, 278, 439, 455, 292, 407, 415, 358, 371, 355,
340, 345, 372, 390, 249, 466, 346, 347, 280, 442, 443, 282, 19, 94, 370,
441, 442, 295, 248, 419, 197, 263, 255, 359, 440, 275, 274, 300, 383, 368,
351, 412, 465, 263, 467, 466, 301, 368, 389, 380, 374, 386, 395, 378, 379,
412, 351, 419, 436, 426, 322, 373, 390, 388, 2, 164, 393, 370, 462, 461,
164, 0, 267, 302, 11, 12, 374, 373, 387, 268, 12, 13, 293, 300, 301, 446,
261, 340, 385, 384, 381, 330, 266, 425, 426, 423, 391, 429, 355, 437, 391,
327, 326, 440, 457, 438, 341, 382, 362, 459, 457, 461, 434, 430, 394, 414,
463, 362, 396, 369, 262, 354, 461, 457, 316, 403, 402, 315, 404, 403, 314,
405, 404, 313, 406, 405, 421, 418, 406, 366, 401, 361, 306, 408, 407, 291,
409, 408, 287, 410, 409, 432, 436, 410, 434, 416, 411, 264, 368, 383, 309,
438, 457, 352, 376, 401, 274, 275, 4, 421, 428, 262, 294, 327, 358, 433,
416, 367, 289, 455, 439, 462, 370, 326, 2, 326, 370, 305, 460, 455, 254,
449, 448, 255, 261, 446, 253, 450, 449, 252, 451, 450, 256, 452, 451, 341,
453, 452, 413, 464, 463, 441, 413, 414, 258, 442, 441, 257, 443, 442, 259,
444, 443, 260, 445, 444, 467, 342, 445, 459, 458, 250, 289, 392, 290, 290,
328, 460, 376, 433, 435, 250, 290, 392, 411, 416, 433, 341, 463, 464, 453,
464, 465, 357, 465, 412, 343, 412, 399, 360, 363, 440, 437, 399, 456, 420,
456, 363, 401, 435, 288, 372, 383, 353, 339, 255, 249, 448, 261, 255, 133,
243, 190, 133, 155, 112, 33, 246, 247, 33, 130, 25, 398, 384, 286, 362, 398,
414, 362, 463, 341, 263, 359, 467, 263, 249, 255, 466, 467, 260, 75, 60,
166, 238, 239, 79, 162, 127, 139, 72, 11, 37, 121, 232, 120, 73, 72, 39,
114, 128, 47, 233, 232, 128, 103, 104, 67, 152, 175, 148, 173, 157, 155,
119, 118, 101, 74, 73, 40, 107, 9, 108, 49, 48, 131, 32, 194, 211, 184, 74,
185, 191, 80, 183, 185, 40, 186, 119, 230, 118, 210, 202, 214, 84, 83, 17,
77, 76, 146, 161, 160, 30, 190, 56, 173, 182, 106, 194, 138, 135, 192, 129,
203, 98, 54, 21, 68, 5, 51, 4, 145, 144, 23, 90, 77, 91, 207, 205, 187, 83,
201, 18, 181, 91, 182, 180, 90, 181, 16, 85, 17, 205, 206, 36, 176, 148,
140, 165, 92, 39, 245, 193, 244, 27, 159, 28, 30, 247, 161, 174, 236, 196,
103, 54, 104, 55, 193, 8, 111, 117, 31, 221, 189, 55, 240, 98, 99, 142, 126,
100, 219, 166, 218, 112, 155, 26, 198, 209, 131, 169, 135, 150, 114, 47,
217, 224, 223, 53, 220, 45, 134, 32, 211, 140, 109, 67, 108, 146, 43, 91,
231, 230, 120, 113, 226, 247, 105, 63, 52, 241, 238, 242, 124, 46, 156, 95,
78, 96, 70, 46, 63, 116, 143, 227, 116, 123, 111, 1, 44, 19, 3, 236, 51,
207, 216, 205, 26, 154, 22, 165, 39, 167, 199, 200, 208, 101, 36, 100, 43,
57, 202, 242, 20, 99, 56, 28, 157, 124, 35, 113, 29, 160, 27, 211, 204, 210,
124, 113, 46, 106, 43, 204, 96, 62, 77, 227, 137, 116, 73, 41, 72, 36, 203,
142, 235, 64, 240, 48, 49, 64, 42, 41, 74, 214, 212, 207, 183, 42, 184, 210,
169, 211, 140, 170, 176, 104, 105, 69, 193, 122, 168, 50, 123, 187, 89, 96,
90, 66, 65, 107, 179, 89, 180, 119, 101, 120, 68, 63, 104, 234, 93, 227, 16,
15, 85, 209, 129, 49, 15, 14, 86, 107, 55, 9, 120, 100, 121, 153, 145, 22,
178, 88, 179, 197, 6, 196, 89, 88, 96, 135, 138, 136, 138, 215, 172, 218,
115, 219, 41, 42, 81, 5, 195, 51, 57, 43, 61, 208, 171, 199, 41, 81, 38,
224, 53, 225, 24, 144, 110, 105, 52, 66, 118, 229, 117, 227, 34, 234, 66,
107, 69, 10, 109, 151, 219, 48, 235, 183, 62, 191, 142, 129, 126, 116, 111,
143, 7, 163, 246, 118, 117, 50, 223, 222, 52, 94, 19, 141, 222, 221, 65,
196, 3, 197, 45, 220, 44, 156, 70, 139, 188, 122, 245, 139, 71, 162, 145,
153, 159, 149, 170, 150, 122, 188, 196, 206, 216, 92, 163, 144, 161, 164, 2,
167, 242, 141, 241, 0, 164, 37, 11, 72, 12, 144, 145, 160, 12, 38, 13, 70,
63, 71, 31, 226, 111, 157, 158, 154, 36, 101, 205, 203, 206, 165, 126, 209,
217, 98, 165, 97, 237, 220, 218, 237, 239, 241, 210, 214, 169, 140, 171, 32,
241, 125, 237, 179, 86, 178, 180, 85, 179, 181, 84, 180, 182, 83, 181, 194,
201, 182, 177, 137, 132, 184, 76, 183, 185, 61, 184, 186, 57, 185, 216, 212,
186, 192, 214, 187, 139, 34, 156, 218, 79, 237, 147, 123, 177, 45, 44, 4,
208, 201, 32, 98, 64, 129, 192, 213, 138, 235, 59, 219, 141, 242, 97, 97, 2,
141, 240, 75, 235, 229, 24, 228, 31, 25, 226, 230, 23, 229, 231, 22, 230,
232, 26, 231, 233, 112, 232, 244, 189, 243, 189, 221, 190, 222, 28, 221,
223, 27, 222, 224, 29, 223, 225, 30, 224, 113, 247, 225, 99, 60, 240, 213,
147, 215, 60, 20, 166, 192, 187, 213, 243, 112, 244, 244, 233, 245, 245,
128, 188, 188, 114, 174, 134, 131, 220, 174, 217, 236, 236, 198, 134, 215,
177, 58, 156, 143, 124, 25, 110, 7, 31, 228, 25, 264, 356, 368, 0, 11, 267,
451, 452, 349, 267, 302, 269, 350, 357, 277, 350, 452, 357, 299, 333, 297,
396, 175, 377, 381, 384, 382, 280, 347, 330, 269, 303, 270, 151, 9, 337,
344, 278, 360, 424, 418, 431, 270, 304, 409, 272, 310, 407, 322, 270, 410,
449, 450, 347, 432, 422, 434, 18, 313, 17, 291, 306, 375, 259, 387, 260,
424, 335, 418, 434, 364, 416, 391, 423, 327, 301, 251, 298, 275, 281, 4,
254, 373, 253, 375, 307, 321, 280, 425, 411, 200, 421, 18, 335, 321, 406,
321, 320, 405, 314, 315, 17, 423, 426, 266, 396, 377, 369, 270, 322, 269,
413, 417, 464, 385, 386, 258, 248, 456, 419, 298, 284, 333, 168, 417, 8,
448, 346, 261, 417, 413, 285, 326, 327, 328, 277, 355, 329, 309, 392, 438,
381, 382, 256, 279, 429, 360, 365, 364, 379, 355, 277, 437, 282, 443, 283,
281, 275, 363, 395, 431, 369, 299, 297, 337, 335, 273, 321, 348, 450, 349,
359, 446, 467, 283, 293, 282, 250, 458, 462, 300, 276, 383, 292, 308, 325,
283, 276, 293, 264, 372, 447, 346, 352, 340, 354, 274, 19, 363, 456, 281,
426, 436, 425, 380, 381, 252, 267, 269, 393, 421, 200, 428, 371, 266, 329,
432, 287, 422, 290, 250, 328, 385, 258, 384, 446, 265, 342, 386, 387, 257,
422, 424, 430, 445, 342, 276, 422, 273, 424, 306, 292, 307, 352, 366, 345,
268, 271, 302, 358, 423, 371, 327, 294, 460, 331, 279, 294, 303, 271, 304,
436, 432, 427, 304, 272, 408, 395, 394, 431, 378, 395, 400, 296, 334, 299,
6, 351, 168, 376, 352, 411, 307, 325, 320, 285, 295, 336, 320, 319, 404,
329, 330, 349, 334, 293, 333, 366, 323, 447, 316, 15, 315, 331, 358, 279,
317, 14, 316, 8, 285, 9, 277, 329, 350, 253, 374, 252, 319, 318, 403, 351,
6, 419, 324, 318, 325, 397, 367, 365, 288, 435, 397, 278, 344, 439, 310,
272, 311, 248, 195, 281, 375, 273, 291, 175, 396, 199, 312, 311, 268, 276,
283, 445, 390, 373, 339, 295, 282, 296, 448, 449, 346, 356, 264, 454, 337,
336, 299, 337, 338, 151, 294, 278, 455, 308, 292, 415, 429, 358, 355, 265,
340, 372, 388, 390, 466, 352, 346, 280, 295, 442, 282, 354, 19, 370, 285,
441, 295, 195, 248, 197, 457, 440, 274, 301, 300, 368, 417, 351, 465, 251,
301, 389, 385, 380, 386, 394, 395, 379, 399, 412, 419, 410, 436, 322, 387,
373, 388, 326, 2, 393, 354, 370, 461, 393, 164, 267, 268, 302, 12, 386, 374,
387, 312, 268, 13, 298, 293, 301, 265, 446, 340, 380, 385, 381, 280, 330,
425, 322, 426, 391, 420, 429, 437, 393, 391, 326, 344, 440, 438, 458, 459,
461, 364, 434, 394, 428, 396, 262, 274, 354, 457, 317, 316, 402, 316, 315,
403, 315, 314, 404, 314, 313, 405, 313, 421, 406, 323, 366, 361, 292, 306,
407, 306, 291, 408, 291, 287, 409, 287, 432, 410, 427, 434, 411, 372, 264,
383, 459, 309, 457, 366, 352, 401, 1, 274, 4, 418, 421, 262, 331, 294, 358,
435, 433, 367, 392, 289, 439, 328, 462, 326, 94, 2, 370, 289, 305, 455, 339,
254, 448, 359, 255, 446, 254, 253, 449, 253, 252, 450, 252, 256, 451, 256,
341, 452, 414, 413, 463, 286, 441, 414, 286, 258, 441, 258, 257, 442, 257,
259, 443, 259, 260, 444, 260, 467, 445, 309, 459, 250, 305, 289, 290, 305,
290, 460, 401, 376, 435, 309, 250, 392, 376, 411, 433, 453, 341, 464, 357,
453, 465, 343, 357, 412, 437, 343, 399, 344, 360, 440, 420, 437, 456, 360,
420, 363, 361, 401, 288, 265, 372, 353, 390, 339, 249, 339, 448, 255];
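
The array above is a flat list of FaceMesh vertex indices, three per triangle; a sketch of drawing the wireframe on a 2D canvas, assuming `points` is an array of `[x, y]` coordinates, one per mesh vertex:

```js
import triangulation from './src/triangulation.js';

function drawMesh(ctx, points) {
  // walk the flat index list three entries at a time
  for (let i = 0; i < triangulation.length; i += 3) {
    const [a, b, c] = [triangulation[i], triangulation[i + 1], triangulation[i + 2]].map((idx) => points[idx]);
    ctx.beginPath();
    ctx.moveTo(a[0], a[1]);
    ctx.lineTo(b[0], b[1]);
    ctx.lineTo(c[0], c[1]);
    ctx.closePath();
    ctx.stroke();
  }
}
```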

BIN
wiki/group1-shard1of1.bin Normal file

Binary file not shown.

1
wiki/model.json Normal file

File diff suppressed because one or more lines are too long