You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
libtdevnc/common/turbojpeg.h

256 lines
9.3 KiB

Add TurboVNC encoding support. TurboVNC is a variant of TightVNC that uses the same client/server protocol (RFB version 3.8t), and thus it is fully cross-compatible with TightVNC and TigerVNC (with one exception, which is noted below.) Both the TightVNC and TurboVNC encoders analyze each rectangle, pick out regions of solid color to send separately, and send the remaining subrectangles using mono, indexed color, JPEG, or raw encoding, depending on the number of colors in the subrectangle. However, TurboVNC uses a fundamentally different selection algorithm to determine the appropriate subencoding to use for each subrectangle. Thus, while it sends a protocol stream that can be decoded by any TightVNC-compatible viewer, the mix of subencoding types in this protocol stream will be different from those generated by a TightVNC server. The research that led to TurboVNC is described in the following report: http://www.virtualgl.org/pmwiki/uploads/About/tighttoturbo.pdf. In summary: 20 RFB captures, representing "common" 2D and 3D application workloads (the 3D workloads were run using VirtualGL), were studied using the TightVNC encoder in isolation. Some of the analysis features in the TightVNC encoder, such as smoothness detection, were found to generate a lot of CPU usage with little or no benefit in compression, so those features were disabled. JPEG encoding was accelerated using libjpeg-turbo (which achieves a 2-4x speedup over plain libjpeg on modern x86 or ARM processors.) Finally, the "palette threshold" (minimum number of colors that the subrectangle must have before it is compressed using JPEG or raw) was adjusted to account for the fact that JPEG encoding is now quite a bit faster (meaning that we can now use it more without a CPU penalty.) TurboVNC has additional optimizations, such as the ability to count colors and encode JPEG images directly from the framebuffer without first translating the pixels into RGB. The TurboVNC encoder compares quite favorably in terms of compression ratio with TightVNC and generally encodes a great deal faster (often an order of magnitude or more.) The version of the TurboVNC encoder included in this patch is roughly equivalent to the one found in version 0.6 of the Unix TurboVNC Server, with a few minor patches integrated from TurboVNC 1.1. TurboVNC 1.0 added multi-threading capabilities, which can be added in later if desired (at the expense of making libvncserver depend on libpthread.) Because TurboVNC uses a fundamentally different mix of subencodings than TightVNC, because it uses the identical protocol (and thus a viewer really has no idea whether it's talking to a TightVNC or TurboVNC server), and because it doesn't support rfbTightPng (and in fact conflicts with it-- see below), the TurboVNC and TightVNC encoders cannot be enabled simultaneously. Compatibility: In *most* cases, a TurboVNC-enabled viewer is fully compatible with a TightVNC server, and vice versa. TurboVNC supports pseudo-encodings for specifying a fine-grained (1-100) quality scale and specifying chrominance subsampling. If a TurboVNC viewer sends those to a TightVNC server, then the TightVNC server ignores them, so the TurboVNC viewer also sends the quality on a 0-9 scale that the TightVNC server can understand. Similarly, the TurboVNC server checks first for fine-grained quality and subsampling pseudo-encodings from the viewer, and failing to receive those, it then checks for the TightVNC 0-9 quality pseudo-encoding. There is one case in which the two systems are not compatible, and that is when a TightVNC or TigerVNC viewer requests compression level 0 without JPEG from a TurboVNC server. For performance reasons, this causes the TurboVNC server to send images directly to the viewer, bypassing Zlib. When the TurboVNC server does this, it also sets bits 7-4 in the compression control byte to rfbTightNoZlib (0x0A), which is unfortunately the same value as rfbTightPng. Older TightVNC viewers that don't handle PNG will assume that the stream is uncompressed but still encapsulated in a Zlib structure, whereas newer PNG-supporting TightVNC viewers will assume that the stream is PNG. In either case, the viewer will probably crash. Since most VNC viewers don't expose compression level 0 in the GUI, this is a relatively rare situation. Description of changes: configure.ac -- Added support for libjpeg-turbo. If passed an argument of --with-turbovnc, configure will now run (or, if cross-compiling, just link) a test program that determines whether the libjpeg library being used is libjpeg-turbo. libjpeg-turbo must be used when building the TurboVNC encoder, because the TurboVNC encoder relies on the libjpeg-turbo colorspace extensions in order to compress images directly out of the framebuffer (which may be, for instance, BGRA rather than RGB.) libjpeg-turbo can optionally be used with the TightVNC encoder as well, but the speedup will only be marginal (the report linked above explains why in more detail, but basically it's because of Amdahl's Law. The TightVNC encoder was designed with the assumption that JPEG had a very high CPU cost, and thus JPEG is used only sparingly.) -- Added a new configure variable, JPEG_LDFLAGS. This is necessitated by the fact that libjpeg-turbo often distributes libjpeg.a and libjpeg.so in /opt/libjpeg-turbo/lib32 or /opt/libjpeg-turbo/lib64, and many people prefer to statically link with it. Thus, more flexibility is needed than is provided by --with-jpeg. If JPEG_LDFLAGS is specified, then it overrides the changes to LDFLAGS enacted by --with-jpeg (but --with-jpeg is still used to set the include path.) The addition of JPEG_LDFLAGS necessitated replacing AC_CHECK_LIB with AC_LINK_IFELSE (because AC_CHECK_LIB automatically sets LIBS to -ljpeg, which is not what we want if we're, for instance, linking statically with libjpeg-turbo.) -- configure does not check for PNG support if TurboVNC encoding is enabled. This prevents the rfbSendRectEncodingTightPng() function from being compiled in, since the TurboVNC encoder doesn't (and can't) support it. common/turbojpeg.c, common/turbojpeg.h -- TurboJPEG is a simple API used to compress and decompress JPEG images in memory. It was originally implemented because it was desirable to use different types of underlying technologies to compress JPEG on different platforms (mediaLib on SPARC, Quicktime on PPC Macs, Intel Performance Primitives, etc.) These days, however, libjpeg-turbo is the only underlying technology used by TurboVNC, so TurboJPEG's purpose is largely just code simplicity and flexibility. Thus, since there is no real need for libvncserver to use any technology other than libjpeg-turbo for compressing JPEG, the TurboJPEG wrapper for libjpeg-turbo has been included in-tree so that libvncserver can be directly linked with libjpeg-turbo. This is convenient because many modern Linux distros (Fedora, Ubuntu, etc.) now ship libjpeg-turbo as their default libjpeg library. libvncserver/rfbserver.c -- Added logic to check for the TurboVNC fine-grained quality level and subsampling encodings and to map Tight (0-9) quality levels to appropriate fine-grained quality level and subsampling values if communicating with a TightVNC/TigerVNC viewer. libvncserver/turbo.c -- TurboVNC encoder (compiled instead of libvncserver/tight.c) rfb/rfb.h -- Added support for the TurboVNC subsampling level rfb/rfbproto.h -- Added constants for the TurboVNC fine quality level and subsampling encodings as well as the rfbTightNoZlib constant and notes on its usage.
12 years ago
/* Copyright (C)2004 Landmark Graphics Corporation
* Copyright (C)2005, 2006 Sun Microsystems, Inc.
* Copyright (C)2009-2011 D. R. Commander
*
* This library is free software and may be redistributed and/or modified under
* the terms of the wxWindows Library License, Version 3.1 or (at your option)
* any later version. The full license is in the LICENSE.txt file included
* with this distribution.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* wxWindows Library License for more details.
*/
#if (defined(_MSC_VER) || defined(__CYGWIN__) || defined(__MINGW32__)) \
&& defined(_WIN32) && defined(DLLDEFINE)
#define DLLEXPORT __declspec(dllexport)
#else
#define DLLEXPORT
#endif
#define DLLCALL
/* Subsampling */
#define NUMSUBOPT 4
enum {TJ_444=0, TJ_422, TJ_420, TJ_GRAYSCALE};
#define TJ_411 TJ_420 /* for backward compatibility with VirtualGL <= 2.1.x,
TurboVNC <= 0.6, and TurboJPEG/IPP */
/* Flags */
#define TJ_BGR 1
/* The components of each pixel in the source/destination bitmap are stored
in B,G,R order, not R,G,B */
#define TJ_BOTTOMUP 2
/* The source/destination bitmap is stored in bottom-up (Windows, OpenGL)
order, not top-down (X11) order */
#define TJ_FORCEMMX 8
/* Turn off CPU auto-detection and force TurboJPEG to use MMX code
(IPP and 32-bit libjpeg-turbo versions only) */
#define TJ_FORCESSE 16
/* Turn off CPU auto-detection and force TurboJPEG to use SSE code
(32-bit IPP and 32-bit libjpeg-turbo versions only) */
#define TJ_FORCESSE2 32
/* Turn off CPU auto-detection and force TurboJPEG to use SSE2 code
(32-bit IPP and 32-bit libjpeg-turbo versions only) */
#define TJ_ALPHAFIRST 64
/* If the source/destination bitmap is 32 bpp, assume that each pixel is
ARGB/XRGB (or ABGR/XBGR if TJ_BGR is also specified) */
#define TJ_FORCESSE3 128
/* Turn off CPU auto-detection and force TurboJPEG to use SSE3 code
(64-bit IPP version only) */
#define TJ_FASTUPSAMPLE 256
/* Use fast, inaccurate 4:2:2 and 4:2:0 YUV upsampling routines
(libjpeg and libjpeg-turbo versions only) */
typedef void* tjhandle;
#define TJPAD(p) (((p)+3)&(~3))
#ifndef max
#define max(a,b) ((a)>(b)?(a):(b))
#endif
#ifdef __cplusplus
extern "C" {
#endif
/* API follows */
/*
tjhandle tjInitCompress(void)
Creates a new JPEG compressor instance, allocates memory for the structures,
and returns a handle to the instance. Most applications will only
need to call this once at the beginning of the program or once for each
concurrent thread. Don't try to create a new instance every time you
compress an image, because this may cause performance to suffer in some
TurboJPEG implementations.
RETURNS: NULL on error
*/
DLLEXPORT tjhandle DLLCALL tjInitCompress(void);
/*
int tjCompress(tjhandle j,
unsigned char *srcbuf, int width, int pitch, int height, int pixelsize,
unsigned char *dstbuf, unsigned long *size,
int jpegsubsamp, int jpegqual, int flags)
[INPUT] j = instance handle previously returned from a call to
tjInitCompress()
[INPUT] srcbuf = pointer to user-allocated image buffer containing RGB or
grayscale pixels to be compressed
[INPUT] width = width (in pixels) of the source image
[INPUT] pitch = bytes per line of the source image (width*pixelsize if the
bitmap is unpadded, else TJPAD(width*pixelsize) if each line of the bitmap
is padded to the nearest 32-bit boundary, such as is the case for Windows
bitmaps. You can also be clever and use this parameter to skip lines,
etc. Setting this parameter to 0 is the equivalent of setting it to
width*pixelsize.
[INPUT] height = height (in pixels) of the source image
[INPUT] pixelsize = size (in bytes) of each pixel in the source image
RGBX/BGRX/XRGB/XBGR: 4, RGB/BGR: 3, Grayscale: 1
[INPUT] dstbuf = pointer to user-allocated image buffer that will receive
the JPEG image. Use the TJBUFSIZE(width, height) function to determine
the appropriate size for this buffer based on the image width and height.
[OUTPUT] size = pointer to unsigned long that receives the size (in bytes)
of the compressed image
[INPUT] jpegsubsamp = Specifies either 4:2:0, 4:2:2, 4:4:4, or grayscale
subsampling. When the image is converted from the RGB to YCbCr colorspace
as part of the JPEG compression process, every other Cb and Cr
(chrominance) pixel can be discarded to produce a smaller image with
little perceptible loss of image clarity (the human eye is more sensitive
to small changes in brightness than small changes in color.)
TJ_420: 4:2:0 subsampling. Discards every other Cb, Cr pixel in both
horizontal and vertical directions
TJ_422: 4:2:2 subsampling. Discards every other Cb, Cr pixel only in
the horizontal direction
TJ_444: no subsampling
TJ_GRAYSCALE: Generate grayscale JPEG image
[INPUT] jpegqual = JPEG quality (an integer between 0 and 100 inclusive)
[INPUT] flags = the bitwise OR of one or more of the flags described in the
"Flags" section above
RETURNS: 0 on success, -1 on error
*/
DLLEXPORT int DLLCALL tjCompress(tjhandle j,
unsigned char *srcbuf, int width, int pitch, int height, int pixelsize,
unsigned char *dstbuf, unsigned long *size,
int jpegsubsamp, int jpegqual, int flags);
/*
unsigned long TJBUFSIZE(int width, int height)
Convenience function that returns the maximum size of the buffer required to
hold a JPEG image with the given width and height
RETURNS: -1 if arguments are out of bounds
*/
DLLEXPORT unsigned long DLLCALL TJBUFSIZE(int width, int height);
/*
tjhandle tjInitDecompress(void)
Creates a new JPEG decompressor instance, allocates memory for the
structures, and returns a handle to the instance. Most applications will
only need to call this once at the beginning of the program or once for each
concurrent thread. Don't try to create a new instance every time you
decompress an image, because this may cause performance to suffer in some
TurboJPEG implementations.
RETURNS: NULL on error
*/
DLLEXPORT tjhandle DLLCALL tjInitDecompress(void);
/*
int tjDecompressHeader2(tjhandle j,
unsigned char *srcbuf, unsigned long size,
int *width, int *height, int *jpegsubsamp)
[INPUT] j = instance handle previously returned from a call to
tjInitDecompress()
[INPUT] srcbuf = pointer to a user-allocated buffer containing a JPEG image
[INPUT] size = size of the JPEG image buffer (in bytes)
[OUTPUT] width = width (in pixels) of the JPEG image
[OUTPUT] height = height (in pixels) of the JPEG image
[OUTPUT] jpegsubsamp = type of chrominance subsampling used when compressing
the JPEG image
RETURNS: 0 on success, -1 on error
*/
DLLEXPORT int DLLCALL tjDecompressHeader2(tjhandle j,
unsigned char *srcbuf, unsigned long size,
int *width, int *height, int *jpegsubsamp);
/*
Legacy version of the above function
*/
DLLEXPORT int DLLCALL tjDecompressHeader(tjhandle j,
unsigned char *srcbuf, unsigned long size,
int *width, int *height);
/*
int tjDecompress(tjhandle j,
unsigned char *srcbuf, unsigned long size,
unsigned char *dstbuf, int width, int pitch, int height, int pixelsize,
int flags)
[INPUT] j = instance handle previously returned from a call to
tjInitDecompress()
[INPUT] srcbuf = pointer to a user-allocated buffer containing the JPEG image
to decompress
[INPUT] size = size of the JPEG image buffer (in bytes)
[INPUT] dstbuf = pointer to user-allocated image buffer that will receive
the bitmap image. This buffer should normally be pitch*height
bytes in size, although this pointer may also be used to decompress into
a specific region of a larger buffer.
[INPUT] width = width (in pixels) of the destination image
[INPUT] pitch = bytes per line of the destination image (width*pixelsize if
the bitmap is unpadded, else TJPAD(width*pixelsize) if each line of the
bitmap is padded to the nearest 32-bit boundary, such as is the case for
Windows bitmaps. You can also be clever and use this parameter to skip
lines, etc. Setting this parameter to 0 is the equivalent of setting it
to width*pixelsize.
[INPUT] height = height (in pixels) of the destination image
[INPUT] pixelsize = size (in bytes) of each pixel in the destination image
RGBX/BGRX/XRGB/XBGR: 4, RGB/BGR: 3, Grayscale: 1
[INPUT] flags = the bitwise OR of one or more of the flags described in the
"Flags" section above.
RETURNS: 0 on success, -1 on error
*/
DLLEXPORT int DLLCALL tjDecompress(tjhandle j,
unsigned char *srcbuf, unsigned long size,
unsigned char *dstbuf, int width, int pitch, int height, int pixelsize,
int flags);
/*
int tjDestroy(tjhandle h)
Frees structures associated with a compression or decompression instance
[INPUT] h = instance handle (returned from a previous call to
tjInitCompress() or tjInitDecompress()
RETURNS: 0 on success, -1 on error
*/
DLLEXPORT int DLLCALL tjDestroy(tjhandle h);
/*
char *tjGetErrorStr(void)
Returns a descriptive error message explaining why the last command failed
*/
DLLEXPORT char* DLLCALL tjGetErrorStr(void);
#ifdef __cplusplus
}
#endif