Nnedi3 resize16

From Avisynth wiki
Author: mawen1250
Version: v3.3
Download: nnedi3_resize16_v3.3.avsi
Category: Resizers
Discussion: NMM-HD Thread [Chinese]


Description

nnedi3_resize16 is an advanced script for image resizing and colorspace conversion.

Requirements

Required Plugins

The latest versions of the following plugins are recommended unless stated otherwise.

Optional script:

Syntax and Parameters

nnedi3_resize16_v3.3.avsi

nnedi3_resize16 (clip input, int "target_width", int "target_height", float "src_left", float "src_top", float "src_width", float "src_height",
\ string "kernel_d", string "kernel_u", float "f_d", float "f_u", int "taps",
\ float "a1", float "a2", float "a3", bool "invks_d", bool "invks_u", int "invkstaps", bool "noring",
\ int "nsize", int "nns", int "qual", int "etype", int "pscrn", int "threads",
\ float "ratiothr", bool "mixed", float "thr", float "elast", float "sharp",
\ string "output", bool "tv_range", string "cplace", string "matrix", string "curve", float "gcor",
\ int "Y", int "U", int "V", bool "lsb_in", bool "lsb", int "dither")

Input clip and resizing parameters

 clip  input =
Input clip to be processed.

 int  target_width =
Target width; default value is the width of the input clip.

 int  target_height =
Target height; default value is the height of the input clip.

 float  src_left = 0.0
 float  src_top = 0.0
Coordinates of the top-left corner of the picture sub-area used as the source for resizing. They can be fractional. If negative, the picture is extended by replicating the edge pixels (the left pixel column for src_left, the top pixel row for src_top).
 float  src_width = width(input)
 float  src_height = height(input)
Size in pixels of the sub-area to resize. They can be fractional. If 0, the area has the same size as the source clip. If negative, they define coordinates relative to the bottom-right corner, in a Crop-like manner.
Default value is the width and height of the input clip.

  • Just like AviSynth's resizers, you can use an expanded syntax which crops before resizing. The same operations are performed as if you cropped just before resizing; there can be a slight speed benefit.
    Note that the edge semantics are slightly different: cropping gives a hard absolute boundary, whereas the resizer filter lobes can extend into the cropped region, but not beyond the physical edge of the image.
    Use Crop to remove any hard borders or other unwanted noise; using the resizer cropping may propagate the noise into the adjacent output pixels.
    Use the resizer cropping to maintain accurate edge rendering when cropping a part of a complete image.
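
As a minimal sketch of the resizer-side cropping described above, the call below trims 8 pixels from the left and right edges while resizing. The ColorBars source and all sizes are illustrative assumptions, not taken from the script's documentation.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=720, height=480, pixel_type="YV12")

# Skip 8 pixels on the left (src_left) and 8 on the right (negative src_width,
# Crop-style). src_height=0 keeps the full source height.
nnedi3_resize16(target_width=1280, target_height=720,
\               src_left=8.0, src_top=0.0, src_width=-8.0, src_height=0.0)
```

If the trimmed columns contain hard borders or noise, Crop() before the call is the safer choice, for the reasons given in the note above.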

Scaling Ratio Calculation

 float  ratiothr = 1.125
  • When the scale ratio is larger than ratiothr, the nnedi3+Dither_resize16 upscale method is used instead of pure Dither_resize16.
  • When the horizontal/vertical scale ratio > "ratiothr", the operation is treated as upscaling.
  • When the horizontal/vertical scale ratio <= "ratiothr", the operation is treated as downscaling.
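
The threshold rule can be seen with a ratio that sits exactly on the default. This is only a sketch: the source clip, the sizes and the lowered ratiothr value of 1.05 are assumptions chosen for illustration.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=1280, height=720, pixel_type="YV12")

# 1440/1280 = 810/720 = 1.125, which is not greater than the default
# ratiothr (1.125), so by the rule above this is handled as downscaling
# (pure Dither_resize16).
a = nnedi3_resize16(target_width=1440, target_height=810)

# With ratiothr lowered to 1.05, the same 1.125x ratio exceeds the threshold,
# so the nnedi3+Dither_resize16 upscale method is used.
b = nnedi3_resize16(target_width=1440, target_height=810, ratiothr=1.05)

# Alternate frames from both results for visual comparison.
Interleave(a, b)
```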

Parameters for merging edge and flat upscaled clip

 bool  mixed = true
nnedi3_resize16 uses nnedi3+Dither_resize16 to upscale edge areas. This parameter controls whether upscaling combines nnedi3+Dither_resize16 (edge areas) with plain Dither_resize16 (flat areas), which gives a higher-precision result, mainly in flat areas.

 float  thr = 1.0
Same as "thr" in Dither_limit_dif16; the valid range is (0, 10.0].
Threshold between reference data and filtered data.

 float  elast = 1.5
Same as "elast" in Dither_limit_dif16; the valid range is [1, 10.0].
To avoid artifacts, the threshold has some elasticity. Value differences above this threshold are gradually attenuated, up to thr * elast.

  • PDiff: pixel value difference between the flat clip and the edge clip (edge clip as reference)
  • ODiff: pixel value difference between the merged clip and the edge clip (edge clip as reference)
  • PDiff, thr and elast are used to calculate ODiff:
  • ODiff = PDiff when [PDiff <= thr]
  • ODiff gradually smooths from thr to 0 when [thr <= PDiff <= thr * elast]
  • For elast > 2.0, ODiff reaches its maximum when [PDiff == thr * elast / 2]
  • ODiff = 0 when [PDiff >= thr * elast]
  • A larger "thr" results in more pixels being taken from the flat-area upscaled clip (Dither_resize16)
  • A larger "thr" results in fewer pixels being taken from the edge-area upscaled clip (nnedi3+Dither_resize16)
  • A larger "elast" results in more pixels being blended from the edge-area and flat-area upscaled clips, for smoother merging
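
The bullets above fix ODiff at the two ends of the transition but not its exact shape in between. One form that is consistent with every bullet is given below. It is an assumption for illustration, not quoted from the script or from Dither_limit_dif16, and it takes PDiff as an absolute difference.

```latex
\[
\mathrm{ODiff} =
\begin{cases}
\mathrm{PDiff}, & \mathrm{PDiff} \le thr \\[4pt]
\mathrm{PDiff}\cdot\dfrac{thr\cdot elast - \mathrm{PDiff}}{thr\cdot elast - thr},
  & thr < \mathrm{PDiff} < thr\cdot elast \\[4pt]
0, & \mathrm{PDiff} \ge thr\cdot elast
\end{cases}
\]
% Worked example with the defaults thr = 1.0, elast = 1.5 (so thr*elast = 1.5):
\[
\mathrm{PDiff} = 1.25 \;\Rightarrow\;
\mathrm{ODiff} = 1.25\cdot\frac{1.5 - 1.25}{1.5 - 1.0} = 0.625
\]
```

Under this form the middle branch is a parabola in PDiff with its vertex at thr * elast / 2. The vertex lies inside the transition interval only when elast > 2, which matches the bullet about the maximum.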

Parameters for nnedi3

 int  nsize = 0
Sets the size of the local neighborhood around each pixel that is used by the predictor neural network.
Possible settings (x_diameter x y_diameter):
  • 0 - 8x6
  • 1 - 16x6
  • 2 - 32x6
  • 3 - 48x6
  • 4 - 8x4
  • 5 - 16x4
  • 6 - 32x4
For image enlargement it is recommended to use 0 or 4. Larger y_diameter settings will result in sharper output.
For deinterlacing larger x_diameter settings will allow connecting lines of smaller slope. However, what setting to use really depends on the amount of aliasing (lost information) in the source.
If the source was heavily low-pass filtered before interlacing, then aliasing will be low and a large x_diameter setting won't be needed, and vice versa.

 int  nns = 3
Sets the number of neurons in the predictor neural network. Possible settings are 0, 1, 2, 3, and 4. 0 is fastest. 4 is slowest, but should give the best quality.
This is a quality vs speed option; however, differences are usually small. The difference in speed will become larger as 'qual' is increased.
  • 0 - 16
  • 1 - 32
  • 2 - 64
  • 3 - 128
  • 4 - 256

 int  qual = 1
Controls the number of different neural network predictions that are blended together to compute the final output value.
Each neural network was trained on a different set of training data. Blending the results of these different networks improves generalization to unseen data.
Possible values are 1 or 2. Essentially this is a quality vs speed option. Larger values will result in more processing time, but should give better results.
However, the difference is usually pretty small. I would recommend using qual>1 for things like single image enlargement.

 int  etype = 0
Controls which set of weights to use in the predictor nn. Possible settings:
  • 0 - weights trained to minimize absolute error
  • 1 - weights trained to minimize squared error

 int  pscrn = 2
Controls whether or not the prescreener neural network is used to decide which pixels should be processed by the predictor neural network and which can be handled by simple cubic interpolation.
The prescreener is trained to know whether cubic interpolation will be sufficient for a pixel or whether it should be predicted by the predictor nn. The computational complexity of the prescreener nn is much less than that of the predictor nn.
Since most pixels can be handled by cubic interpolation, using the prescreener generally results in much faster processing. The prescreener is pretty accurate, so the difference between using it and not using it is almost always unnoticeable.
Version 0.9.3 adds a new, faster prescreener with three selectable 'levels', which trade off the number of pixels detected as only requiring cubic interpolation versus incurred error.
Therefore, pscrn is now an integer with possible values of 0, 1, 2, 3, and 4.
  • 0 - no prescreening (same as false in prior versions)
  • 1 - original prescreener (same as true in prior versions)
  • 2 - new prescreener level 0
  • 3 - new prescreener level 1
  • 4 - new prescreener level 2
Higher levels for the new prescreener result in cubic interpolation being used on fewer pixels (so are slower, but incur less error). However, the difference is pretty much unnoticeable.
Level 2 is closest to the original prescreener in terms of incurred error, but is much faster.

 int  threads = 0
Controls how many threads will be used for processing. If set to 0, threads will be set equal to the number of detected processors.
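
A quality-oriented combination of the nnedi3 settings above might look like the following sketch. The source clip and target size are assumptions, and the parameter values are just one choice among those documented above.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=640, height=480, pixel_type="YV12")

# 2x enlargement favouring quality over speed:
#   nsize=4   -> 8x4 neighbourhood (one of the two sizes recommended for enlargement)
#   nns=4     -> 256 neurons (slowest setting)
#   qual=2    -> blend two network predictions
#   etype=0   -> weights trained to minimize absolute error
#   pscrn=2   -> new prescreener, level 0
#   threads=0 -> one thread per detected processor
nnedi3_resize16(target_width=1280, target_height=960,
\               nsize=4, nns=4, qual=2, etype=0, pscrn=2, threads=0)
```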

Parameters for Dither_resize16

 string  kernel_d = "Spline36Resize"
 string  kernel_u = "Spline64Resize"
"kernelh","kernelv" of Dither_resize16; kernel_d is used in downscaling and kernel_u is used in upscaling.
Kernel used by the resizer. Possible values are:
  • "point" - Nearest neighbor interpolation. Same as PointResize().
  • "rect" or "box" - Box filter.
  • "linear" or "bilinear" - Bilinear interpolation. Same as BilinearResize().
  • "cubic" or "bicubic" - Bicubic interpolation. Same as BicubicResize(). The b and c variables are mapped on a1 and a2 and are both set to 1/3 by default.
  • "lanczos" - Sinc function windowed by the central lobe of a sinc. Use taps to specify its impulse length. Same as LanczosResize().
  • "blackman" - Blackman-Harris windowed sinc. Use taps to control its length. Same as BlackmanResize().
  • "blackmanminlobe" - Another kind of Blackman windowed sinc, with a bit less ringing. Use taps to control its length.
  • "spline16" - Cubic spline based kernel, 4 sample points. Same as Spline16Resize().
  • "spline36" - Spline, 6 sample points. Same as Spline36Resize().
  • "spline64" - Spline, 8 sample points. Same as Spline64Resize().
  • "spline" - Generic splines. The number of sample points is twice the taps parameter, so you can use taps = 6 to get a Spline144Resize() equivalent.
  • "gauss" or "gaussian" - Gaussian kernel. The p parameter is mapped on a1 and controls the curve width. The higher p, the sharper. It is set to 30 by default. This resizer is the same as GaussResize(), but taps offers control over the filter impulse length. For low p values (soft and blurry), it’s better to increase the number of taps to avoid truncating the Gaussian curve too early and creating artifacts.
  • "sinc" - Truncated sinc function. Use taps to control its length. Same as SincResize().
  • "impulse" - Offers the possibility to create your own kernel (useful for convolutions). Add your coefficients in the string after “impulse”, separated with spaces (ex: "impulse 1 2 1"). The number of coefficients must be odd. The curve is linearly interpolated between the provided points. You can oversample the impulse by setting kovrspl to a value > 1.

 float  f_d = 1.0
 float  f_u = 1.0
"fh","fv" of Dither_resize16; f_d is for downscaling, f_u is for upscaling.
Horizontal and vertical frequency factors, also known as inverse kernel support. They are multipliers on the theoretical kernel cutoff frequency in both directions.
Values below 1.0 spatially expand the kernel and blur the picture. Values over 1.0 shrink the kernel and let higher frequencies pass. The result will look sharper but more aliased.
The multiplier is applied after the kernel scaling in case of downsizing. Negative values force the processing, even if the horizontal size doesn’t change. The filter will use the absolute parameter value.

 int  taps = 4
"taps" of Dither_resize16.
Some kernels have a variable number of sample points, given by this parameter. Strictly speaking, it counts half the number of lobes (or equivalent); in case of downscaling, the actual number of sample points may be greater than the specified value. Range: 1–128

 float  a1 =
 float  a2 =
 float  a3 =
Specific parameters, depending on the selected kernel.
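
As a sketch of how the kernel parameters fit together, the call below downscales with a bicubic kernel and sets its two coefficients through a1 and a2. The source clip and sizes are assumptions for illustration.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=1920, height=1080, pixel_type="YV12")

# 1280/1920 is below ratiothr, so this is a downscale and kernel_d applies.
# Per the kernel list, a1 and a2 carry the bicubic b and c values;
# b=0, c=0.5 is the Catmull-Rom spline.
nnedi3_resize16(target_width=1280, target_height=720,
\               kernel_d="bicubic", a1=0.0, a2=0.5)
```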

 bool  invks_d = false
 bool  invks_u = false
"invksh","invksv" of Dither_resize16; invks_d is used in downscaling, invks_u is used in upscaling.
Activates the kernel inversion mode for the specified direction (use invks for both). Inverting the kernel makes it possible to “undo” a previous upsizing by compensating for the loss in high frequencies, giving a sharper and more accurate output than classic kernels, closer to the original. This is particularly useful for clips upscaled with a bilinear kernel. All the kernel-related parameters specify the kernel to undo. The target resolution must be as close as possible to the initial resolution. Kernel inversion is mainly intended for downsizing an upscaled picture. Using it for upsizing will not restore details but will give a slightly sharper look, at the cost of a bit of aliasing and ringing. This mode is somewhat equivalent to the debilinear plug-in but works on a different principle.

 int  invkstaps = 5
In kernel inversion mode (invks=true), this parameter sets the number of taps for the inverted kernel. Use it as a trade-off between softness and ringing. Range: 1–128
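
The typical use of kernel inversion, undoing an earlier bilinear upscale, could be sketched as follows. The source clip is a synthetic stand-in and the original 1280x720 resolution is an assumption.

```avisynth
# Stand-in for a clip that was previously upscaled from 1280x720 with a
# bilinear kernel; replace with your own upscaled source.
ColorBars(width=1280, height=720, pixel_type="YV12").BilinearResize(1920, 1080)

# Downscale back to the assumed original resolution. kernel_d names the
# kernel to undo, invks_d enables inversion for downscaling, and invkstaps
# is left at its documented default of 5.
nnedi3_resize16(target_width=1280, target_height=720,
\               kernel_d="bilinear", invks_d=true, invkstaps=5)
```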

 bool  noring = false
If true, uses the non-ringing algorithm of Dither_resize16 when scaling flat areas.
It does not make much sense for nnedi3_resize16 (which uses nnedi3 for upscaling), and it may produce blurring and aliasing when downscaling. Do not set it to true unless you know what you are doing.

Post-Process

 float  sharp = 0
Strength of the Contra-Sharpen mod, for sharper edges. 0 means no sharpening; a common value is about 100.
Sharpening only takes effect when the horizontal or vertical scale ratio is greater than ratiothr (that is, when nnedi3 is used for upscaling).
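
A short sketch of enabling the sharpener on an upscale. The source clip and sizes are assumptions, and 100 is simply the common value mentioned above.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=960, height=540, pixel_type="YV12")

# 2x upscale: the ratio 2.0 exceeds the default ratiothr, so nnedi3 is used
# and the sharp setting takes effect.
nnedi3_resize16(target_width=1920, target_height=1080, sharp=100)
```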

Input / Output

 int  Y = 3
 int  U = 3
 int  V = 3
Choose what planes to process; works just like MaskTools2.
  • 2 : copy from input clip
  • 3 : process

 bool  lsb_in = false
Whether the input clip is 16-bit stacked.

 bool  lsb = false
Whether the output clip is 16-bit stacked, and whether processing is done at 16-bit precision.

 bool  tv_range = true
Whether the input clip is TV-range (16-235, true) or PC-range (0-255, false).

 int  dither =
Dither mode for 16-bit to 8-bit conversion. If tv_range=true, it defaults to 6, if false it defaults to 50.
Dithering method:
  • -1 - No dither, round to the closest value.
  • 0 - 8-bit ordered dither + noise.
  • 6 - Serpentine Floyd-Steinberg error diffusion + noise. Well-balanced algorithm.
  • 7 - Stucki error diffusion + noise. Looks “sharp” and preserves light edges and details well.
  • 8 - Atkinson error diffusion + noise. Generates distinct patterns but keeps flat areas clean.
Modes 1 to 5 have no real interest over mode 0 and can be considered deprecated.
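
The two output paths can be compared with the sketch below. The source clip and sizes are assumptions. DitherPost comes from the Dither package and is used here only to bring the stacked 16-bit result back to 8 bits.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=1280, height=720, pixel_type="YV12")

# 8-bit in, 8-bit out, with Stucki error diffusion for the final conversion.
a = nnedi3_resize16(target_width=1920, target_height=1080, dither=7)

# 8-bit in, 16-bit stacked out (lsb=true), then dithered down to 8 bits as a
# separate step with the same mode.
b = nnedi3_resize16(target_width=1920, target_height=1080, lsb=true).DitherPost(mode=7)

# Alternate frames from both results for visual comparison.
Interleave(a, b)
```

Keeping the stacked 16-bit output is mainly useful when further 16-bit filters follow before the final dithering step.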

 string  output =
Output format. Possible values are:
  • "Y8" - Regular Y8 colorspace. Parameter "lsb" works on this output mode.
  • "YV12" - Regular YV12 colorspace. Parameter "lsb" works on this output mode.
  • "YV16" - Regular YV16 colorspace. Parameter "lsb" works on this output mode.
  • "YV24" - Regular YV24 colorspace. Parameter "lsb" works on this output mode.
  • "RGB24" - Regular RGB24 colorspace.
  • "RGB32" - Regular RGB32 colorspace.
  • "RGB48YV12" - 48-bit RGB conveyed on YV12. Use it for raw video export only. Not suitable for display or further processing (it will look like garbage).
  • "RGB48Y" - 48-bit RGB. The components R, G and B are conveyed on three YV12 or Y8 (if supported) stack16 clips interleaved on a frame basis.
If output is not defined it will default to the colorspace of the input clip.
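
Because the target size defaults to the input size, the script can also be used for format conversion alone, as in this sketch. The source clip is an assumption for illustration.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=1280, height=720, pixel_type="YV12")

# No target size is given, so the frame size is unchanged; only the output
# format changes (YV12 in, YV24 out).
nnedi3_resize16(output="YV24")
```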

 string  cplace = "MPEG2"
Placement of the chroma subsamples. Can be one of these strings:
  • "MPEG1" - 4:2:0 subsampling used in MPEG-1. Chroma samples are located on the center of each group of 4 pixels.
  • "MPEG2" - Subsampling used in MPEG-2 4:2:x. Chroma samples are located on the left pixel column of the group.

 string  matrix =
The matrix used to convert the YUV pixels to computer RGB. Possible values are:
  • "601" - ITU-R BT.601 / ITU-R BT.470-2 / SMPTE 170M. For Standard Definition content.
  • "709" - ITU-R BT.709. For High Definition content.
  • "240" - SMPTE 240M
  • "YCgCo" - YCgCo
When the parameter is not defined, "601" or "709" is selected automatically from the clip resolution: if the width is greater than 1024 or the height is greater than 576, matrix defaults to "709"; otherwise it defaults to "601".
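
The automatic choice can be overridden when the resolution does not reflect how the material was encoded. The sketch below assumes a hypothetical SD clip known to use BT.709.

```avisynth
# Stand-in SD source: 720x576 is neither wider than 1024 nor taller than 576,
# so matrix would default to "601" under the rule above.
ColorBars(width=720, height=576, pixel_type="YV12")

# Force BT.709 for the YUV to RGB conversion instead of the automatic "601".
nnedi3_resize16(output="RGB32", matrix="709")
```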

 string  curve = "linear"
Type of gamma mapping (transfer characteristic) for gamma-aware resizing (only takes effect for the Dither_resize16 processing parts).
  • "709" - ITU-R BT.709 transfer curve for digital video
  • "601" - ITU-R BT.601 transfer curve, same as "709"
  • "170" - SMPTE 170M, same as "709"
  • "240" - SMPTE 240M (1987)
  • "srgb" - sRGB curve
  • "2020" - ITU-R BT.2020 transfer curve, for 12-bit content. For sources of lower bitdepth, use the "709" curve.
  • "linear" - linear curve without gamma-aware processing

 float  gcor = 1.0
Gamma correction, applied on the linear part.
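
A gamma-aware downscale could be requested as in the sketch below. The source clip and sizes are assumptions, and the source is assumed to use the BT.709 transfer curve.

```avisynth
# Stand-in source for illustration; replace with your own clip.
ColorBars(width=1920, height=1080, pixel_type="YV12")

# Downscale with gamma-aware processing by naming the source transfer curve;
# gcor is left at its default of 1.0.
nnedi3_resize16(target_width=1280, target_height=720, curve="709")
```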

Examples

nnedi3_resize16 with default values (the target size and output colorspace default to those of the input clip):

nnedi3_resize16()


Convert a 4:2:0 JPEG to RGB. Remember that most JPEGs use full range levels and the BT.601 color matrix. MPEG1 chroma placement is very common among 4:2:0 JPEGs while 4:2:2 JPEGs use the MPEG2 chroma placement.

nnedi3_resize16(output="RGB32", tv_range=false, cplace="MPEG1", matrix="601")
