mirror of
				git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
				synced 2025-09-04 20:19:47 +08:00 
			
		
		
		
	
			
				
					
						
					
					58284a901b
				
			
			
		
	
	
		
			7 Commits
		
	
	
	| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  Nick Terrell | b1a3e75e46 | lz4: fix kernel decompression speed This patch replaces all memcpy() calls with LZ4_memcpy() which calls __builtin_memcpy() so the compiler can inline it. LZ4 relies heavily on memcpy() with a constant size being inlined. In x86 and i386 pre-boot environments memcpy() cannot be inlined because memcpy() doesn't get defined as __builtin_memcpy(). An equivalent patch has been applied upstream so that the next import won't lose this change [1]. I've measured the kernel decompression speed using QEMU before and after this patch for the x86_64 and i386 architectures. The speed-up is about 10x as shown below. Code Arch Kernel Size Time Speed v5.8 x86_64 11504832 B 148 ms 79 MB/s patch x86_64 11503872 B 13 ms 885 MB/s v5.8 i386 9621216 B 91 ms 106 MB/s patch i386 9620224 B 10 ms 962 MB/s I also measured the time to decompress the initramfs on x86_64, i386, and arm. All three show the same decompression speed before and after, as expected. [1] https://github.com/lz4/lz4/pull/890 Signed-off-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Arvind Sankar <nivedita@alum.mit.edu> Link: http://lkml.kernel.org/r/20200803194022.2966806-1-nickrterrell@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | ||
|  Gao Xiang | 2209fda323 | lib/lz4: update LZ4 decompressor module Update the LZ4 compression module based on LZ4 v1.8.3 in order for the erofs file system to use the newest LZ4_decompress_safe_partial() which can now decode exactly the nb of bytes requested [1] to take place of the open hacked code in the erofs file system itself. Currently, apart from the erofs file system, no other users use LZ4_decompress_safe_partial, so no worry about the interface. In addition, LZ4 v1.8.x boosts up decompression speed compared to the current code which is based on LZ4 v1.7.3, mainly due to shortcut optimization for the specific common LZ4-sequences [2]. lzbench testdata (tested in kirin710, 8 cores, 4 big cores at 2189Mhz, 2GB DDR RAM at 1622Mhz, with enwik8 testdata [3]): Compressor name Compress. Decompress. Compr. size Ratio Filename memcpy 5004 MB/s 4924 MB/s 100000000 100.00 enwik8 lz4hc 1.7.3 -9 12 MB/s 653 MB/s 42203253 42.20 enwik8 lz4hc 1.8.0 -9 12 MB/s 908 MB/s 42203096 42.20 enwik8 lz4hc 1.8.3 -9 11 MB/s 965 MB/s 42203094 42.20 enwik8 [1] https://github.com/lz4/lz4/issues/566 | ||
|  Sven Schmidt | 4e1a33b105 | lib: update LZ4 compressor module Patch series "Update LZ4 compressor module", v7. This patchset updates the LZ4 compression module to a version based on LZ4 v1.7.3 allowing to use the fast compression algorithm aka LZ4 fast which provides an "acceleration" parameter as a tradeoff between high compression ratio and high compression speed. We want to use LZ4 fast in order to support compression in lustre and (mostly, based on that) investigate data reduction techniques in behalf of storage systems. Also, it will be useful for other users of LZ4 compression, as with LZ4 fast it is possible to enable applications to use fast and/or high compression depending on the usecase. For instance, ZRAM is offering a LZ4 backend and could benefit from an updated LZ4 in the kernel. LZ4 homepage: http://www.lz4.org/ LZ4 source repository: https://github.com/lz4/lz4 Source version: 1.7.3 Benchmark (taken from [1], Core i5-4300U @1.9GHz): ----------------|--------------|----------------|---------- Compressor | Compression | Decompression | Ratio ----------------|--------------|----------------|---------- memcpy | 4200 MB/s | 4200 MB/s | 1.000 LZ4 fast 50 | 1080 MB/s | 2650 MB/s | 1.375 LZ4 fast 17 | 680 MB/s | 2220 MB/s | 1.607 LZ4 fast 5 | 475 MB/s | 1920 MB/s | 1.886 LZ4 default | 385 MB/s | 1850 MB/s | 2.101 [1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html [PATCH 1/5] lib: Update LZ4 compressor module [PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version [PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version [PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version [PATCH 5/5] lib/lz4: Remove back-compat wrappers This patch (of 5): Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet. The kernel module is inspired by the previous work by Chanho Min. The updated LZ4 module will not break existing code since the patchset contains appropriate changes. API changes: New method LZ4_compress_fast which differs from the variant available in kernel by the new acceleration parameter, allowing to trade compression ratio for more compression speed and vice versa. LZ4_decompress_fast is the respective decompression method, featuring a very fast decoder (multiple GB/s per core), able to reach RAM speed in multi-core systems. The decompressor allows to decompress data compressed with LZ4 fast as well as the LZ4 HC (high compression) algorithm. Also the useful functions LZ4_decompress_safe_partial and LZ4_compress_destsize were added. The latter reverses the logic by trying to compress as much data as possible from source to dest while the former aims to decompress partial blocks of data. A bunch of streaming functions were also added which allow compressig/decompressing data in multiple steps (so called "streaming mode"). The methods lz4_compress and lz4_decompress_unknownoutputsize are now known as LZ4_compress_default respectivley LZ4_decompress_safe. The old methods will be removed since there's no callers left in the code. [arnd@arndb.de: fix KERNEL_LZ4 support] Link: http://lkml.kernel.org/r/20170208211946.2839649-1-arnd@arndb.de [akpm@linux-foundation.org: simplify] [akpm@linux-foundation.org: fix the simplification] [4sschmid@informatik.uni-hamburg.de: fix performance regressions] Link: http://lkml.kernel.org/r/1486898178-17125-2-git-send-email-4sschmid@informatik.uni-hamburg.de [4sschmid@informatik.uni-hamburg.de: v8] Link: http://lkml.kernel.org/r/1487182598-15351-2-git-send-email-4sschmid@informatik.uni-hamburg.de Link: http://lkml.kernel.org/r/1486321748-19085-2-git-send-email-4sschmid@informatik.uni-hamburg.de Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Bongkyu Kim <bongkyu.kim@lge.com> Cc: Rui Salvaterra <rsalvaterra@gmail.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David S. Miller <davem@davemloft.net> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | ||
|  Rui Salvaterra | dea5c24a14 | lib: lz4: cleanup unaligned access efficiency detection These identifiers are bogus. The interested architectures should define HAVE_EFFICIENT_UNALIGNED_ACCESS whenever relevant to do so. If this isn't true for some arch, it should be fixed in the arch definition. Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | ||
|  Rui Salvaterra | 3e26a691fe | lib: lz4: fixed zram with lz4 on big endian machines Based on Sergey's test patch [1], this fixes zram with lz4 compression on big endian cpus. Note that the 64-bit preprocessor test is not a cleanup, it's part of the fix, since those identifiers are bogus (for example, __ppc64__ isn't defined anywhere else in the kernel, which means we'd fall into the 32-bit definitions on ppc64). Tested on ppc64 with no regression on x86_64. [1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4 Cc: stable@vger.kernel.org Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | ||
|  Chanho Min | c72ac7a1a9 | lib: add lz4 compressor module This patchset is for supporting LZ4 compression and the crypto API using
it.
As shown below, the size of data is a little bit bigger but compressing
speed is faster under the enabled unaligned memory access.  We can use
lz4 de/compression through crypto API as well.  Also, It will be useful
for another potential user of lz4 compression.
lz4 Compression Benchmark:
Compiler: ARM gcc 4.6.4
ARMv7, 1 GHz based board
   Kernel: linux 3.4
   Uncompressed data Size: 101 MB
         Compressed Size  compression Speed
   LZO   72.1MB		  32.1MB/s, 33.0MB/s(UA)
   LZ4   75.1MB		  30.4MB/s, 35.9MB/s(UA)
   LZ4HC 59.8MB		   2.4MB/s,  2.5MB/s(UA)
- UA: Unaligned memory Access support
- Latest patch set for LZO applied
This patch:
Add support for LZ4 compression in the Linux Kernel.  LZ4 Compression APIs
for kernel are based on LZ4 implementation by Yann Collet and were changed
for kernel coding style.
LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository : http://code.google.com/p/lz4/
svn revision : r90
Two APIs are added:
lz4_compress() support basic lz4 compression whereas lz4hc_compress()
support high compression or CPU performance get lower but compression
ratio get higher.  Also, we require the pre-allocated working memory with
the defined size and destination buffer must be allocated with the size of
lz4_compressbound.
[akpm@linux-foundation.org: make lz4_compresshcctx() static]
Signed-off-by: Chanho Min <chanho.min@lge.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Herbert Xu <herbert@gondor.hengli.com.au>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | ||
|  Kyungsik Lee | cffb78b0e0 | decompressor: add LZ4 decompressor module Add support for LZ4 decompression in the Linux Kernel.  LZ4 Decompression
APIs for kernel are based on LZ4 implementation by Yann Collet.
Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2
1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)
2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied
This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].
But we have five of decompressors already and one question which does
arise, however, is that of where do we stop adding new ones?  This issue
had been discussed and came to the conclusion [2].
Russell King said that we should have:
 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)
If we have a replacement one for one of these, then it should do exactly
that: replace it.
The benchmark shows that an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO(which has been known as the
fastest decompressor in the Kernel).  Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].
[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347
LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/
Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |