mirror of https://github.com/vladmandic/human
cleanup config and results docs
parent 5e874c0761
commit 5e89af1004

@ -0,0 +1,57 @@
# Configuration

`Human` configuration is a simple object that can be passed to the constructor and overridden during any `human.detect()` call

- [**Configuration Interface Specification**](https://vladmandic.github.io/human/typedoc/interfaces/Config.html)
- [**Configuration Interface Definition**](https://github.com/vladmandic/human/blob/main/src/config.ts#L183)
- [**Default configuration values**](https://github.com/vladmandic/human/blob/main/src/config.ts#L253)

<br>

Overview of `Config` object type:

```ts
interface Config {
  backend: string,                     // backend engine to be used for processing
  wasmPath: string,                    // root download path for wasm binaries
  debug: boolean,                      // verbose messages to console
  async: boolean,                      // use asynchronous processing within human
  warmup: string,                      // optional warmup of backend engine
  modelBasePath: string,               // root download path for all models
  cacheSensitivity: number,            // threshold for result cache validation
  skipAllowed: boolean,                // *internal* set after cache validation check
  filter: FilterConfig,                // controls input pre-processing
  face: FaceConfig,                    // controls face detection and all modules that rely on detected face
  body: BodyConfig,                    // controls body pose detection
  hand: HandConfig,                    // controls hand and finger detection
  object: ObjectConfig,                // controls object detection
  gesture: GestureConfig,              // controls gesture analysis
  segmentation: SegmentationConfig,    // controls body segmentation
}
```

<br>

Most configuration options are exposed in the `demo` application UI:

<br>



<br>

Configuration object is large, but typically you only need to modify a few values:

- `enabled`: Choose which models to use
- `modelBasePath`: Update as needed to reflect your application's relative path

For example:

```js
const myConfig = {
  modelBasePath: 'https://cdn.jsdelivr.net/npm/@vladmandic/human@2.3.5/models/',
  segmentation: { enabled: true },
};
const human = new Human(myConfig);
const result = await human.detect(input);
```
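
Since user values and defaults are deep-merged, a partial configuration object can also be passed directly to `human.detect()` to override settings for a single call. A minimal sketch, assuming `human` and `input` from the example above:

```js
// per-call override: only the values being changed need to be specified,
// everything else keeps the values set at construction time or the defaults
const fastResult = await human.detect(input, {
  body: { enabled: false },  // skip body pose detection for this call
  hand: { enabled: false },  // skip hand detection for this call
});
```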

236 Configuration.md
@ -1,236 +0,0 @@
# Configuration

[**Configuration Interface Specification**](https://vladmandic.github.io/human/typedoc/interfaces/Config.html)

Detailed configuration options are explained below, but they are best seen in the menus present in the `demo` application:
*note: some advanced configuration options are not exposed in the UI*

<br>



<br>

Main configuration objects are:

- **config.filter**: controls image pre-processing
- **config.face**: controls face detection
- **config.body**: controls body pose detection
- **config.hand**: controls hand and finger detection

With **config.face** having several subsections (see the sketch after this list):

- **config.face.detector**: controls general face detection that all other face modules depend on
- **config.face.mesh**: controls facial mesh and landmark detection
- **config.face.description**: controls age & gender prediction and face descriptor
- **config.face.iris**: controls iris detection
- **config.face.emotion**: controls emotion prediction
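
For illustration, a minimal sketch of how these subsections might be toggled individually (property names as listed in the defaults further below):

```js
// enable only the face submodules that are actually needed;
// modules left disabled are skipped during detection
const myConfig = {
  face: {
    enabled: true,                // required by all face submodules
    mesh: { enabled: true },
    iris: { enabled: false },     // skip iris analysis
    emotion: { enabled: false },  // skip emotion prediction
  },
};
```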

<br>

Below is the full output of the `human.defaults` object
Any property can be overridden by passing a user object during `human.detect()`
Note that the user object and default configuration are merged using deep-merge, so you do not need to redefine the entire configuration

All configuration details can be changed in real time

```js
const config: Config = {
  backend: '',               // select tfjs backend to use, leave empty to use default backend
                             // for browser environments: 'webgl', 'wasm', 'cpu', or 'humangl' (which is a custom version of webgl)
                             // for nodejs environments: 'tensorflow', 'wasm', 'cpu'
                             // default set to `humangl` for browsers and `tensorflow` for nodejs
  modelBasePath: '',         // base path for all models
                             // default set to `../models/` for browsers and `file://models/` for nodejs
  wasmPath: '',              // path for wasm binaries, only used for backend: wasm
                             // default set to download from jsdelivr during Human class instantiation
  debug: true,               // print additional status messages to console
  async: true,               // execute enabled models in parallel
  warmup: 'full',            // what to use for human.warmup(), can be 'none', 'face', 'full'
                             // warmup pre-initializes all models for faster inference but can take
                             // significant time on startup
                             // only used for `webgl` and `humangl` backends
  cacheSensitivity: 0.70,    // cache sensitivity
                             // values 0..1 where 0.01 means reset cache if input changed more than 1%
                             // set to 0 to disable caching
  skipAllowed: false,        // internal & dynamic
  filter: {                  // run input through image filters before inference
                             // image filters run with near-zero latency as they are executed on the GPU
    enabled: true,           // enable image pre-processing filters
    width: 0,                // resize input width
    height: 0,               // resize input height
                             // if both width and height are set to 0, there is no resizing
                             // if just one is set, second one is scaled automatically
                             // if both are set, values are used as-is
    flip: false,             // flip input as mirror image
    return: true,            // return processed canvas imagedata in result
    brightness: 0,           // range: -1 (darken) to 1 (lighten)
    contrast: 0,             // range: -1 (reduce contrast) to 1 (increase contrast)
    sharpness: 0,            // range: 0 (no sharpening) to 1 (maximum sharpening)
    blur: 0,                 // range: 0 (no blur) to N (blur radius in pixels)
    saturation: 0,           // range: -1 (reduce saturation) to 1 (increase saturation)
    hue: 0,                  // range: 0 (no change) to 360 (hue rotation in degrees)
    negative: false,         // image negative
    sepia: false,            // image sepia colors
    vintage: false,          // image vintage colors
    kodachrome: false,       // image kodachrome colors
    technicolor: false,      // image technicolor colors
    polaroid: false,         // image polaroid camera effect
    pixelate: 0,             // range: 0 (no pixelate) to N (number of pixels to pixelate)
  },

  gesture: {
    enabled: true,           // enable gesture recognition based on model results
  },

  face: {
    enabled: true,           // controls if specified module is enabled
                             // face.enabled is required for all face models:
                             // detector, mesh, iris, age, gender, emotion
                             // (note: module is not loaded until it is required)
    detector: {
      modelPath: 'blazeface.json',  // detector model, can be absolute path or relative to modelBasePath
      rotation: true,        // use best-guess rotated face image or just box with rotation as-is
                             // false means higher performance, but incorrect mesh mapping if face angle is above 20 degrees
                             // this parameter is not valid in nodejs
      maxDetected: 1,        // maximum number of faces detected in the input
                             // should be set to the minimum number for performance
      skipFrames: 99,        // how many max frames to go without re-running the face bounding box detector
                             // only used when cacheSensitivity is not zero
      skipTime: 2500,        // how many ms to go without re-running the face bounding box detector
                             // only used when cacheSensitivity is not zero
      minConfidence: 0.2,    // threshold for discarding a prediction
      iouThreshold: 0.1,     // amount of overlap between two detected objects before one object is removed
      return: false,         // return extracted face as tensor
                             // in which case user is responsible for disposing the tensor
    },

    mesh: {
      enabled: true,
      modelPath: 'facemesh.json',   // facemesh model, can be absolute path or relative to modelBasePath
    },

    iris: {
      enabled: true,
      modelPath: 'iris.json',       // face iris model
                                    // can be either absolute path or relative to modelBasePath
    },

    emotion: {
      enabled: true,
      minConfidence: 0.1,    // threshold for discarding a prediction
      skipFrames: 99,        // how many max frames to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      skipTime: 1500,        // how many ms to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      modelPath: 'emotion.json',    // face emotion model, can be absolute path or relative to modelBasePath
    },

    description: {
      enabled: true,         // to improve accuracy of face description extraction it is
                             // recommended to enable detector.rotation and mesh.enabled
      modelPath: 'faceres.json',    // face description model
                                    // can be either absolute path or relative to modelBasePath
      skipFrames: 99,        // how many max frames to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      skipTime: 3000,        // how many ms to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      minConfidence: 0.1,    // threshold for discarding a prediction
    },

    antispoof: {
      enabled: false,
      skipFrames: 99,        // how many max frames to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      skipTime: 4000,        // how many ms to go without re-running the detector
                             // only used when cacheSensitivity is not zero
      modelPath: 'antispoof.json',  // face anti-spoofing model
                                    // can be either absolute path or relative to modelBasePath
    },
  },

  body: {
    enabled: true,
    modelPath: 'movenet-lightning.json',  // body model, can be absolute path or relative to modelBasePath
                                          // can be 'posenet', 'blazepose', 'efficientpose', 'movenet-lightning', 'movenet-thunder'
    detector: {
      modelPath: '',         // optional body detector
    },
    maxDetected: -1,         // maximum number of people detected in the input
                             // should be set to the minimum number for performance
                             // only valid for posenet and movenet-multipose as other models detect a single pose
                             // set to -1 to autodetect based on number of detected faces
    minConfidence: 0.3,      // threshold for discarding a prediction
    skipFrames: 1,           // how many max frames to go without re-running the detector
                             // only used when cacheSensitivity is not zero
    skipTime: 200,           // how many ms to go without re-running the detector
                             // only used when cacheSensitivity is not zero
  },

  hand: {
    enabled: true,
    rotation: true,          // use best-guess rotated hand image or just box with rotation as-is
                             // false means higher performance, but incorrect finger mapping if hand is inverted
                             // only valid for `handdetect` variation
    skipFrames: 99,          // how many max frames to go without re-running the hand bounding box detector
                             // only used when cacheSensitivity is not zero
    skipTime: 2000,          // how many ms to go without re-running the hand bounding box detector
                             // only used when cacheSensitivity is not zero
    minConfidence: 0.50,     // threshold for discarding a prediction
    iouThreshold: 0.2,       // amount of overlap between two detected objects before one object is removed
    maxDetected: -1,         // maximum number of hands detected in the input
                             // should be set to the minimum number for performance
                             // set to -1 to autodetect based on number of detected faces
    landmarks: true,         // detect hand landmarks or just hand bounding box
    detector: {
      modelPath: 'handtrack.json',      // hand detector model, can be absolute path or relative to modelBasePath
                                        // can be 'handdetect' or 'handtrack'
    },
    skeleton: {
      modelPath: 'handskeleton.json',   // hand skeleton model, can be absolute path or relative to modelBasePath
    },
  },

  object: {
    enabled: false,
    modelPath: 'mb3-centernet.json',    // experimental: object detection model, can be absolute path or relative to modelBasePath
                                        // can be 'mb3-centernet' or 'nanodet'
    minConfidence: 0.2,      // threshold for discarding a prediction
    iouThreshold: 0.4,       // amount of overlap between two detected objects before one object is removed
    maxDetected: 10,         // maximum number of objects detected in the input
    skipFrames: 99,          // how many max frames to go without re-running the detector
                             // only used when cacheSensitivity is not zero
    skipTime: 1000,          // how many ms to go without re-running the object detector
                             // only used when cacheSensitivity is not zero
  },

  segmentation: {
    enabled: false,          // controls and configures the body segmentation module
                             // removes background from input containing person
                             // if segmentation is enabled it will run as preprocessing task before any other model
                             // alternatively leave it disabled and use it on-demand using human.segmentation method which can
                             // remove background or replace it with user-provided background
    modelPath: 'selfie.json',           // experimental: segmentation model, can be absolute path or relative to modelBasePath
                                        // can be 'selfie' or 'meet'
    blur: 8,                 // blur segmentation output by n pixels for more realistic image
  },
};
```
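
As a hedged example of how the caching values above interact: per the comments in the listing, setting `cacheSensitivity` to 0 disables result caching entirely, in which case the various `skipFrames`/`skipTime` values no longer apply. That can be useful when processing unrelated still images rather than a video stream:

```js
// sketch: unrelated still images, so frame-to-frame caching is not useful
const myConfig = {
  cacheSensitivity: 0,         // 0 disables result caching
  filter: { enabled: false },  // optionally skip image pre-processing as well
};
const human = new Human(myConfig);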

<br>

Any user configuration and default configuration are merged using deep-merge, so you do not need to redefine the entire configuration
Configuration object is large, but typically you only need to modify a few values:

- `enabled`: Choose which models to use
- `modelPath`: Update as needed to reflect your application's relative path

For example:

```js
const myConfig = {
  backend: 'wasm',
  filter: { enabled: false },
};
const result = await human.detect(image, myConfig);
```

4 Home.md
@ -42,8 +42,8 @@ JavaScript module using TensorFlow/JS Machine Learning library
- [**Home**](https://github.com/vladmandic/human/wiki)
- [**Installation**](https://github.com/vladmandic/human/wiki/Install)
- [**Usage & Functions**](https://github.com/vladmandic/human/wiki/Usage)
- [**Configuration Details**](https://github.com/vladmandic/human/wiki/Configuration)
- [**Output Details**](https://github.com/vladmandic/human/wiki/Outputs)
- [**Configuration Details**](https://github.com/vladmandic/human/wiki/Config)
- [**Result Details**](https://github.com/vladmandic/human/wiki/Result)
- [**Caching & Smoothing**](https://github.com/vladmandic/human/wiki/Caching)
- [**Face Recognition & Face Description**](https://github.com/vladmandic/human/wiki/Embedding)
- [**Gesture Recognition**](https://github.com/vladmandic/human/wiki/Gesture)

134 Outputs.md
@ -1,134 +0,0 @@
# Outputs

The result of the `human.detect()` method is a single object that includes data for all enabled modules and all detected objects
The `Result` object also includes a `persons` getter which, when invoked, sorts results according to the person that a particular body part belongs to
A `Result` object can also be generated as a smoothed, time-based interpolation from the last known `Result` using the `human.next()` method

- [**Result Interface Specification**](https://vladmandic.github.io/human/typedoc/interfaces/Result.html)
- [**Sample Result JSON**](../assets/sample-result.json)
- [**Sample Persons JSON**](../assets/sample-persons.json)

<br>

Simplified documentation of `Result` object type:

```js
result: Result = {
  timestamp:           // timestamp in milliseconds when detection occurred
  canvas:              // optional processed canvas
  face:                // <array of detected objects>
  [
    {
      id,              // <number> face id number
      score,           // <number> overall detection score, returns faceScore if it exists, otherwise boxScore
      faceScore,       // <number> confidence score in detection box after running mesh
      boxScore,        // <number> confidence score in detection box before running mesh
      box,             // <array [x, y, width, height]>, clamped and normalized to input image size
      boxRaw,          // <array [x, y, width, height]>, unclamped and normalized to range of 0..1
      mesh,            // <array of 3D points [x, y, z]> 468 base points & 10 iris points, normalized to input image size
      meshRaw,         // <array of 3D points [x, y, z]> 468 base points & 10 iris points, normalized to range of 0..1
      annotations,     // <list of object { landmark: array of points }> 32 base annotated landmarks & 2 iris annotations
      age,             // <number> estimated age
      gender,          // <string> 'male', 'female'
      genderScore,     // <number> confidence score in gender detection
      embedding,       // <array>[float] vector of number values used for face similarity compare
      iris,            // <number> relative distance of iris to camera, multiply by focal length to get actual distance
      emotion:         // <array of emotions> returns multiple possible emotions for a given face, each with probability
      [
        {
          score,       // <number> probability of emotion
          emotion,     // <string> 'angry', 'disgust', 'fear', 'happy', 'sad', 'surprise', 'neutral'
        }
      ],
      rotation: {
        angle:         // 3d face rotation values in radians in range of -pi/2 to pi/2 which is -90 to +90 degrees
        {
          roll,        // roll is face lean left/right, value of 0 means center
          yaw,         // yaw is face turn left/right, value of 0 means center
          pitch,       // pitch is face move up/down, value of 0 means center
        }
        matrix: []     // flat array of [3,3] that can be directly used for GL matrix rotations such as in Three.js
        gaze:
        {
          bearing,     // direction of gaze in radians
          strength,    // strength of a gaze at a direction of angle
        },
      }
      tensor:          // if config.face.detector.return is set to true, detector will return
                       // a raw tensor containing cropped image of a face
                       // note that tensors must be explicitly disposed to free memory by calling tf.dispose(tensor);
    }
  ],
  body:                // <array of detected objects>
  [
    {
      id,              // body id number
      score,           // <number>, overall detection score, only used for 'posenet', not used for 'blazepose'
      keypoints,       // for 'posenet': <array of 2D landmarks [ score, landmark, position [x, y] ]> 17 annotated landmarks
                       // for 'blazepose': <array of 2D landmarks [ score, landmark, position [x, y, z], presence ]>
                       // 39 annotated landmarks for full or 31 annotated landmarks for upper
                       // presence denotes probability value in range 0..1 that the point is located within the frame
      box,             // <array [x, y, width, height]>, clamped and normalized to input image size
      boxRaw,          // <array [x, y, width, height]>, unclamped and normalized to range of 0..1
    }
  ],
  hand:                // <array of detected objects>
  [
    {
      id,              // hand id number
      score,           // <number>, overall detection confidence score
      box,             // <array [x, y, width, height]>, clamped and normalized to input image size
      boxRaw,          // <array [x, y, width, height]>, unclamped and normalized to range of 0..1
      keypoints,       // <array of 3D points [x, y, z]> 21 points
      annotations,     // <object containing annotated 3D landmarks for each finger> 6 fingers
      landmarks,       // <object containing logical curl and direction values for each finger> 5 fingers
    }
  ],
  object:              // <array of detected objects>
  [
    {
      score,           // <number>
      class,           // <number> class id based on coco labels
      label,           // <string> class label based on coco labels
      box,             // <array [x, y, width, height]>, clamped and normalized to input image size
      boxRaw,          // <array [x, y, width, height]>, unclamped and normalized to range of 0..1
    }
  ],
  gesture:             // <array of objects>
  [
    gesture-type: {    // type of a gesture, can be face, iris, body, hand
      id,              // <number>
      gesture,         // <string>
    }
  ],
  performance: {       // performance data of last execution for each module measured in milliseconds
                       // note that per-model performance data is not available in async execution mode
    frames,            // total number of frames processed
    cached,            // total number of frames where some cached values were used
    backend,           // time to initialize tf backend, keeps longest value measured
    load,              // time to load models, keeps longest value measured
    image,             // time for image processing
    gesture,           // gesture analysis time
    body,              // model time
    hand,              // model time
    face,              // model time
    agegender,         // model time
    emotion,           // model time
    change,            // frame change detection time
    total,             // end to end time
  },
  persons:             // virtual object that is calculated on-demand by reading it
                       // it unifies face, body, hands and gestures belonging to the same person under a single object
  [
    id,                // <number>
    face,              // face object
    body,              // body object if found
    hands: {
      left,            // left hand object if found
      right,           // right hand object if found
    }
    gestures: []       // array of gestures
  ]
}
```
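
As a hedged illustration of reading nested values from this structure (field names as documented above; the 180/π factor is the standard radians-to-degrees conversion):

```js
// sketch: read head rotation and gaze for the first detected face, if any
const result = await human.detect(input);
if (result.face.length > 0) {
  const face = result.face[0];
  const toDeg = (rad) => rad * 180 / Math.PI;  // radians to degrees
  console.log('roll/yaw/pitch (deg):',
    toDeg(face.rotation.angle.roll),
    toDeg(face.rotation.angle.yaw),
    toDeg(face.rotation.angle.pitch));
  console.log('gaze bearing (rad):', face.rotation.gaze.bearing,
    'strength:', face.rotation.gaze.strength);
}
```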

@ -0,0 +1,42 @@
# Outputs

The result of the `human.detect()` method is a single object that includes data for all enabled modules and all detected objects

- `persons` property is a special getter which, when invoked, sorts results according to the person that a particular body part belongs to
- `performance` property is a set of performance counters used to monitor `Human` performance
- `canvas` property is an optional property that returns the input after processing, suitable for drawing on screen

A `Result` object can also be generated as a smoothed, time-based interpolation from the last known `Result` using the `human.next()` method
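
A minimal sketch of how interpolation might be used in a render loop; this assumes detection runs separately at its own rate and that the library's built-in draw helpers are available:

```js
// sketch: draw smoothed results on every animation frame
// while detection itself runs at whatever rate the hardware allows
async function drawLoop(canvas) {
  const interpolated = human.next();     // interpolated from the last known result
  human.draw.all(canvas, interpolated);  // assumes built-in draw helpers
  requestAnimationFrame(() => drawLoop(canvas));
}
```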

<br>

Full documentation:

- [**Result Interface Specification**](https://vladmandic.github.io/human/typedoc/interfaces/Result.html)
- [**Result Interface Definition**](https://github.com/vladmandic/human/blob/main/src/result.ts)

<br>

Overview of `Result` object type:

```ts
interface Result {
  /** {@link FaceResult}: detection & analysis results */
  face: Array<FaceResult>,
  /** {@link BodyResult}: detection & analysis results */
  body: Array<BodyResult>,
  /** {@link HandResult}: detection & analysis results */
  hand: Array<HandResult>,
  /** {@link GestureResult}: detection & analysis results */
  gesture: Array<GestureResult>,
  /** {@link ObjectResult}: detection & analysis results */
  object: Array<ObjectResult>,
  /** global performance object with timing values for each operation */
  performance: Record<string, number>,
  /** optional processed canvas that can be used to draw input on screen */
  canvas?: OffscreenCanvas | HTMLCanvasElement | null | undefined,
  /** timestamp of detection representing the milliseconds elapsed since the UNIX epoch */
  readonly timestamp: number,
  /** getter property that returns unified persons object */
  persons: Array<PersonResult>,
}
```
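
A hedged sketch of consuming these properties after a detection call, assuming `human` and `input` are already set up as in the configuration examples:

```js
const result = await human.detect(input);
// per-module result arrays
console.log('faces:', result.face.length, 'bodies:', result.body.length, 'hands:', result.hand.length);
// detected gestures as strings
console.log('gestures:', result.gesture.map((g) => g.gesture));
// timing values in milliseconds; per-module timings are also available
console.log('total time (ms):', result.performance.total);
// persons getter: unified per-person view of face, body, hands and gestures
for (const person of result.persons) {
  console.log('person', person.id, 'has body:', !!person.body);
}
```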