Human Library
3D Face Detection, Body Pose, Hand & Finger Tracking, Iris Tracking, Age & Gender Prediction, Emotion Prediction & Gesture Recognition
Compatible with Browser, WebWorker and NodeJS execution
Compatible with CPU, WebGL, WASM and WebGPU backends
(and maybe with React-Native as it doesn't use any DOM objects)
This is a pre-release project; see the issues list for known limitations and planned enhancements
Suggestions are welcome!
Examples
Installation
Important
The packaged (IIFE and ESM) version of Human includes the TensorFlow/JS (TFJS) 2.7.0 library, which can be accessed via human.tf
You should NOT manually load another instance of tfjs, but if you do, be aware of possible version conflicts
There are multiple ways to use the Human library; pick the one that suits you:
Included
- dist/human.js: IIFE format bundle with TFJS for Browsers
- dist/human.esm.js: ESM format bundle with TFJS for Browsers
- dist/human.esm-nobundle.js: ESM format bundle without TFJS for Browsers
- dist/human.node.js: CommonJS format bundle with TFJS for NodeJS
- dist/human.node-nobundle.js: CommonJS format bundle without TFJS for NodeJS
All versions include a sourcemap (.map) and a build manifest (.json)
While Human is in pre-release mode, all bundles are non-minified
Defaults:
{
"main": "dist/human.node.js",
"module": "dist/human.esm.js",
"browser": "dist/human.esm.js",
}
1. IIFE script
Simplest way for usage within Browser
Simply download dist/human.js, include it in your HTML file & it's ready to use.
<script src="dist/human.js"><script>
IIFE script auto-registers global namespace Human within the global Window object,
which you can use to create an instance of the Human library:
const human = new Human();
This way you can also use the Human library within an embedded <script> tag within your HTML page for an all-in-one approach
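For example, an all-in-one page can run detection from an embedded script (a minimal sketch; sample.jpg is a placeholder for any image served next to the page):
<script src="dist/human.js"></script>
<script>
  // Human is registered on the global Window object by the IIFE bundle
  const human = new Human();
  // wait until the image has loaded before running detection
  const image = new Image();
  image.src = 'sample.jpg';
  image.onload = async () => {
    const result = await human.detect(image);
    console.log(result);
  };
</script>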
2. ESM module
Recommended for usage within Browser
2.1 Using Script Module
You could use the same syntax within your main JS file if it's imported with <script type="module">
<script src="./index.js" type="module">
and then in your index.js
import Human from 'dist/human.esm.js'; // for direct import must use path to module, not package name
const human = new Human();
2.2 With Bundler
If you're using a bundler (such as rollup, webpack, parcel, browserify or esbuild) to package your client application,
you can import the ESM version of the Human library, which supports full tree shaking
Install with:
npm install @vladmandic/human
import Human from '@vladmandic/human'; // points to @vladmandic/human/dist/human.esm.js
// you can also force-load specific version
// for example: `@vladmandic/human/dist/human.esm-nobundle.js`
const human = new Human();
Or if you prefer to package your own version of tfjs, you can use the nobundle version
Install with:
npm install @vladmandic/human @tensorflow/tfjs
import tf from '@tensorflow/tfjs';
import Human from '@vladmandic/human/dist/human.esm-nobundle.js'; // same functionality as default import, but without tfjs bundled
const human = new Human();
3. NPM module
Recommended for NodeJS projects that will execute in the backend
Entry point is the CommonJS format bundle dist/human.node.js
You also need to install and include tfjs-node or tfjs-node-gpu in your project so it can register an optimized backend
Install with:
npm install @vladmandic/human @tensorflow/tfjs-node
And then use with:
const tf = require('@tensorflow/tfjs-node'); // can also use '@tensorflow/tfjs-node-gpu' if you have environment with CUDA extensions
const Human = require('@vladmandic/human').default; // points to @vladmandic/human/dist/human.node.js
const human = new Human();
Since NodeJS projects load weights from the local filesystem instead of using http calls, you must modify the default configuration to include correct paths with the file:// prefix
For example:
const config = {
body: { enabled: true, modelPath: 'file://models/posenet/model.json' },
}
Weights
Pretrained model weights are included in ./models
Default configuration uses relative paths to your entry script pointing to ../models
If your application resides in a different folder, modify the modelPath property in the configuration of each module
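For example, if your models are copied to a hypothetical public/models folder, the relevant paths could be overridden per module (a minimal sketch):
const myConfig = {
  face: { detector: { modelPath: 'public/models/blazeface/back/model.json' } }, // hypothetical location
  body: { modelPath: 'public/models/posenet/model.json' },
};
const result = await human.detect(image, myConfig);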
Demo
Demos are included in /demo:
Browser:
- index.html: Full demo using Browser with ESM module, includes selectable backends and webworkers
  it loads dist/demo-browser-index.js which is built from sources in demo, starting with demo/browser
  alternatively you can load demo/browser.js directly
If you want to test wasm or webgpu backends, enable loading in index.html
NodeJS:
- node.js: Demo using NodeJS with CommonJS module
  This is a very simple demo; although the Human library is compatible with NodeJS execution and is able to load images and models from the local filesystem, browser-only features such as image pre-processing filters (which require WebGL) are not available
Usage
The Human library does not require special initialization.
All configuration is done in a single JSON object and all model weights are dynamically loaded upon their first usage
(and only then; Human will not load weights it doesn't need according to the configuration).
There is only ONE method you need:
// 'image': can be of any type of an image object: HTMLImage, HTMLVideo, HTMLMedia, Canvas, Tensor4D
// 'config': optional parameter used to override any options present in default configuration
// configuration is fully dynamic and can change between different calls to 'detect()'
const result = await human.detect(image, config?)
or if you want to use promises
human.detect(image, config?).then((result) => {
// your code
})
Additionally, the Human library exposes several objects and methods:
human.config // access to configuration object, normally set as parameter to detect()
human.defaults // read-only view of default configuration object
human.models // dynamically maintained list of loaded models
human.tf // instance of tfjs used by human
human.state // <string> describing current operation in progress
// progresses through: 'config', 'check', 'backend', 'load', 'run:<model>', 'idle'
human.load(config) // explicitly call load method that loads configured models
// if you want to pre-load them instead of on-demand loading during 'human.detect()'
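For example, a minimal sketch that pre-loads all models enabled in the configuration before the first detection (assuming myConfig is your user configuration object):
const human = new Human();
await human.load(myConfig); // loads all models enabled in myConfig up-front
const result = await human.detect(image, myConfig); // no on-demand model loading happens here anymore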
Note that when using the Human library in NodeJS, you must load and parse the image before you pass it for detection and dispose it afterwards
Input format is Tensor4D[1, width, height, 3] of type float32
For example:
const fs = require('fs');
const tf = require('@tensorflow/tfjs-node');
const Human = require('@vladmandic/human').default;

const imageFile = '../assets/sample1.jpg';
const buffer = fs.readFileSync(imageFile);
const decoded = tf.node.decodeImage(buffer);
const casted = decoded.toFloat();
const image = casted.expandDims(0); // Tensor4D [1, width, height, 3] of type float32
decoded.dispose();
casted.dispose();
console.log('Processing:', image.shape);
const human = new Human();
const result = await human.detect(image, config); // config must include file:// model paths as shown above
image.dispose();
Configuration
Detailed configuration options are explained below, but they are best seen in the menus present in the demo application.
Below is the output of the human.defaults object
Any property can be overridden by passing a user object during human.detect()
Note that the user object and default configuration are merged using deep-merge, so you do not need to redefine the entire configuration
All configuration details can be changed in real-time!
config = {
backend: 'webgl', // select tfjs backend to use
console: true, // enable debugging output to console
async: false, // execute enabled models in parallel
// this disables per-model performance data but slightly increases performance
// cannot be used if profiling is enabled
profile: false, // enable tfjs profiling
// this has significant performance impact, only enable for debugging purposes
// currently only implemented for age,gender,emotion models
deallocate: false, // aggressively deallocate gpu memory after each usage
// only valid for webgl backend and only during first call, cannot be changed unless library is reloaded
// this has significant performance impact, only enable on low-memory devices
scoped: false, // enable scoped runs
// some models *may* have memory leaks, this wraps everything in a local scope at a cost of performance
// typically not needed
videoOptimized: true, // perform additional optimizations when input is video, must be disabled for images
filter: { // note: image filters are only available in Browser environments and not in NodeJS as they require WebGL for processing
enabled: true, // enable image pre-processing filters
return: true, // return processed canvas imagedata in result
width: 0, // resize input width
height: 0, // resize input height
// useful on low-performance devices to reduce the size of processed input
// if both width and height are set to 0, there is no resizing
// if just one is set, second one is scaled automatically
// if both are set, values are used as-is
brightness: 0, // range: -1 (darken) to 1 (lighten)
contrast: 0, // range: -1 (reduce contrast) to 1 (increase contrast)
sharpness: 0, // range: 0 (no sharpening) to 1 (maximum sharpening)
blur: 0, // range: 0 (no blur) to N (blur radius in pixels)
saturation: 0, // range: -1 (reduce saturation) to 1 (increase saturation)
hue: 0, // range: 0 (no change) to 360 (hue rotation in degrees)
negative: false, // image negative
sepia: false, // image sepia colors
vintage: false, // image vintage colors
kodachrome: false, // image kodachrome colors
technicolor: false, // image technicolor colors
polaroid: false, // image polaroid camera effect
pixelate: 0, // range: 0 (no pixelate) to N (number of pixels to pixelate)
},
face: {
enabled: true, // controls if the specified module is enabled
// face.enabled is required for all face models: detector, mesh, iris, age, gender, emotion
// note: module is not loaded until it is required
detector: {
modelPath: '../models/blazeface/back/model.json', // can be 'front' or 'back'.
// 'front' is optimized for large faces such as front-facing camera and 'back' is optimized for distant faces.
inputSize: 256, // fixed value: 128 for front and 256 for 'back'
maxFaces: 10, // maximum number of faces detected in the input, should be set to the minimum number for performance
skipFrames: 10, // how many frames to go without re-running the face bounding box detector
// only used for video inputs, ignored for static inputs
// if model is running at 25 FPS, we can re-use existing bounding box for updated face mesh analysis
// as the face probably hasn't moved much in short time (10 * 1/25 = 0.25 sec)
minConfidence: 0.5, // threshold for discarding a prediction
iouThreshold: 0.3, // threshold for deciding whether boxes overlap too much in non-maximum suppression
scoreThreshold: 0.7, // threshold for deciding when to remove boxes based on score in non-maximum suppression
},
mesh: {
enabled: true,
modelPath: '../models/facemesh/model.json',
inputSize: 192, // fixed value
},
iris: {
enabled: true,
modelPath: '../models/iris/model.json',
enlargeFactor: 2.3, // empiric tuning
inputSize: 64, // fixed value
},
age: {
enabled: true,
modelPath: '../models/ssrnet-age/imdb/model.json', // can be 'imdb' or 'wiki'
// which determines training set for model
inputSize: 64, // fixed value
skipFrames: 10, // how many frames to go without re-running the detector, only used for video inputs
},
gender: {
enabled: true,
minConfidence: 0.8, // threshold for discarding a prediction
modelPath: '../models/ssrnet-gender/imdb/model.json',
},
emotion: {
enabled: true,
inputSize: 64, // fixed value
minConfidence: 0.5, // threshold for discarding a prediction
skipFrames: 10, // how many frames to go without re-running the detector, only used for video inputs
modelPath: '../models/emotion/model.json',
},
},
body: {
enabled: true,
modelPath: '../models/posenet/model.json',
inputResolution: 257, // fixed value
outputStride: 16, // fixed value
maxDetections: 10, // maximum number of people detected in the input, should be set to the minimum number for performance
scoreThreshold: 0.7, // threshold for deciding when to remove boxes based on score in non-maximum suppression
nmsRadius: 20, // radius for deciding points are too close in non-maximum suppression
},
hand: {
enabled: true,
inputSize: 256, // fixed value
skipFrames: 10, // how many frames to go without re-running the hand bounding box detector
// only used for video inputs
// if model is running at 25 FPS, we can re-use existing bounding box for updated hand skeleton analysis
// as the hand probably hasn't moved much in short time (10 * 1/25 = 0.25 sec)
minConfidence: 0.5, // threshold for discarding a prediction
iouThreshold: 0.3, // threshold for deciding whether boxes overlap too much in non-maximum suppression
scoreThreshold: 0.7, // threshold for deciding when to remove boxes based on score in non-maximum suppression
enlargeFactor: 1.65, // empiric tuning as skeleton prediction prefers hand box with some whitespace
maxHands: 10, // maximum number of hands detected in the input, should be set to the minimum number for performance
detector: {
modelPath: '../models/handdetect/model.json',
},
skeleton: {
modelPath: '../models/handskeleton/model.json',
},
},
gesture: {
enabled: true, // enable simple gesture recognition
// takes processed data and based on geometry detects simple gestures
// easily expandable via code, see `src/gesture.js`
},
};
Any user configuration and default configuration are merged using deep-merge, so you do not need to redefine the entire configuration
The configuration object is large, but typically you only need to modify a few values:
- enabled: Choose which models to use
- modelPath: Update as needed to reflect your application's relative path
For example:
const myConfig = {
backend: 'wasm',
filter: { enabled: false },
}
const result = await human.detect(image, myConfig)
Outputs
Result of human.detect() is a single object that includes data for all enabled modules and all detected objects:
result = {
version: // <string> version string of the human library
face: // <array of detected objects>
[
{
confidence, // <number>
box, // <array [x, y, width, height]>
mesh, // <array of 3D points [x, y, z]> 468 base points & 10 iris points
annotations, // <list of object { landmark: array of points }> 32 base annotated landmarks & 2 iris annotations
iris, // <number> relative distance of iris to camera, multiply by focal length to get actual distance
age, // <number> estimated age
gender, // <string> 'male', 'female'
}
],
body: // <array of detected objects>
[
{
score, // <number>,
keypoints, // <array of 2D landmarks [ score, landmark, position [x, y] ]> 17 annotated landmarks
}
],
hand: // <array of detected objects>
[
{
confidence, // <number>,
box, // <array [x, y, width, height]>,
landmarks, // <array of 3D points [x, y, z]> 21 points
annotations, // <array of 3D landmarks [ landmark: <array of points> ]> 5 annotated landmarks
}
],
emotion: // <array of emotions>
[
{
score, // <number> probability of emotion
emotion, // <string> 'angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral'
}
],
gesture: // object containing parsed gestures
{
face, // <array of string>
body, // <array of string>
hand, // <array of string>
}
performance: { // performance data of last execution for each module, measured in milliseconds
backend, // time to initialize tf backend, valid only during backend startup
load, // time to load models, valid only during model load
image, // time for image processing
gesture, // gesture analysis time
body, // model time
hand, // model time
face, // model time
agegender, // model time
emotion, // model time
total, // end to end time
}
}
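For example, a short sketch that consumes this structure by iterating over the arrays of detected objects (field names as listed above):
const result = await human.detect(image);
for (const face of result.face) {
  console.log('face:', face.confidence, face.age, face.gender, face.box);
}
for (const body of result.body) console.log('body:', body.score, body.keypoints.length);
for (const hand of result.hand) console.log('hand:', hand.confidence, hand.box);
console.log('gestures:', result.gesture);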
Profile
If config.profile is enabled, a call to human.profile() will return detailed profiling data from the last detect invocation.
For example:
result = {
{age: {…}, gender: {…}, emotion: {…}}
age:
timeKernelOps: 53.78892800000002
newBytes: 4
newTensors: 1
numKernelOps: 341
peakBytes: 46033948
largestKernelOps: Array(5)
0: {name: "Reshape", bytesAdded: 107648, totalBytesSnapshot: 46033948, tensorsAdded: 1, totalTensorsSnapshot: 1149, …}
1: {name: "Reshape", bytesAdded: 0, totalBytesSnapshot: 45818652, tensorsAdded: 1, totalTensorsSnapshot: 1147, …}
2: {name: "Reshape", bytesAdded: 0, totalBytesSnapshot: 45633996, tensorsAdded: 1, totalTensorsSnapshot: 1148, …}
3: {name: "Reshape", bytesAdded: 0, totalBytesSnapshot: 45389376, tensorsAdded: 1, totalTensorsSnapshot: 1154, …}
4: {name: "Reshape", bytesAdded: 53824, totalBytesSnapshot: 45381776, tensorsAdded: 1, totalTensorsSnapshot: 1155, …}
slowestKernelOps: Array(5)
0: {name: "_FusedMatMul", bytesAdded: 12, totalBytesSnapshot: 44802280, tensorsAdded: 1, totalTensorsSnapshot: 1156, …}
1: {name: "_FusedMatMul", bytesAdded: 4, totalBytesSnapshot: 44727564, tensorsAdded: 1, totalTensorsSnapshot: 1152, …}
2: {name: "_FusedMatMul", bytesAdded: 12, totalBytesSnapshot: 44789100, tensorsAdded: 1, totalTensorsSnapshot: 1157, …}
3: {name: "Add", bytesAdded: 4, totalBytesSnapshot: 44788748, tensorsAdded: 1, totalTensorsSnapshot: 1158, …}
4: {name: "Add", bytesAdded: 4, totalBytesSnapshot: 44788748, tensorsAdded: 1, totalTensorsSnapshot: 1158, …}
}
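To collect this data, enable profiling in the configuration and read human.profile() after detection (a minimal sketch):
const human = new Human();
const result = await human.detect(image, { profile: true }); // profiling is currently implemented for age, gender and emotion models
const profileData = human.profile(); // detailed profiling data from the last detect() invocation
console.log(profileData);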
Build
If you want to modify the library and perform a full rebuild:
clone the repository, install dependencies, check for errors and run a full rebuild, which creates bundles from /src into /dist:
git clone https://github.com/vladmandic/human
cd human
npm install # installs all project dependencies
npm run lint # checks the code with eslint
npm run build # creates bundles from /src into /dist
This will rebuild the library itself (all variations) as well as the demo
Project is written in pure JavaScript (ECMAScript version 2020)
The only project dependency is @tensorflow/tfjs
Development dependencies are eslint (used for code linting) and esbuild (used for IIFE and ESM script bundling)
Performance
Performance will vary depending on your hardware, but also on the number and resolution of input videos/images, the enabled modules and their parameters
For example, it can perform multiple face detections at 60+ FPS, but drops to ~15 FPS on a moderately complex image if all modules are enabled
Performance per module on a notebook with nVidia GTX1050 GPU on a FullHD input:
- Enabled all: 15 FPS
- Image filters: 80 FPS (standalone)
- Gesture: 80 FPS (standalone)
- Face Detect: 80 FPS (standalone)
- Face Geometry: 30 FPS (includes face detect)
- Face Iris: 30 FPS (includes face detect and face geometry)
- Age: 60 FPS (includes face detect)
- Gender: 60 FPS (includes face detect)
- Emotion: 60 FPS (includes face detect)
- Hand: 40 FPS (standalone)
- Body: 50 FPS (standalone)
Performance per module on a smartphone with Snapdragon 855 on a FullHD input:
- Enabled all: 5 FPS
- Image filters: 30 FPS (standalone)
- Gesture: 30 FPS (standalone)
- Face Detect: 20 FPS (standalone)
- Face Geometry: 10 FPS (includes face detect)
- Face Iris: 5 FPS (includes face detect and face geometry)
- Age: 20 FPS (includes face detect)
- Gender: 20 FPS (includes face detect)
- Emotion: 20 FPS (includes face detect)
- Hand: 40 FPS (standalone)
- Body: 10 FPS (standalone)
For performance details, see the output of the result.performance object after running inference
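For example, the timings can simply be logged after each call (a minimal sketch):
const result = await human.detect(image);
console.log('timings (ms):', result.performance); // per-module timings such as face, body, hand, plus total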
Credits
- Face Detection: MediaPipe BlazeFace
- Facial Spacial Geometry: MediaPipe FaceMesh
- Eye Iris Details: MediaPipe Iris
- Hand Detection & Skeleton: MediaPipe HandPose
- Body Pose Detection: PoseNet
- Age & Gender Prediction: SSR-Net
- Emotion Prediction: Oarriaga
- Image Filters: WebGLImageFilter