Subtile decoding: memory use reduction and perf improvements #1010

Merged: 29 commits, Sep 5, 2017
84bbb4a
opj_t1_allocate_buffers(): remove useless overflow checks
rouault Aug 21, 2017
0a25dce
opj_j2k_setup_encoder(): validate code block width/height
rouault Aug 21, 2017
aa71981
opj_compress: reorder checks related to code block dimensions, to avo…
rouault Aug 21, 2017
f9e9942
Sub-tile decoding: only allocate tile component buffer of the needed …
rouault Sep 1, 2017
eee5104
opj_dwt_decode_partial_tile(): avoid undefined behaviour in lifting o…
rouault Sep 1, 2017
c37e360
opj_tcd_init_tile(): fix typo on overflow detection condition (introd…
rouault Sep 1, 2017
d5153ba
Remove limitation that prevents from opening images bigger than 4 bil…
rouault Sep 1, 2017
d1299d9
Fix compiler warning in release mode
rouault Sep 1, 2017
008a12d
TCD: allow tile buffer to be greater than 4GB on 64 bit hosts (but nu…
rouault Sep 1, 2017
98b9310
Various changes to allow tile buffers of more than 4giga pixels
rouault Sep 1, 2017
5d07d46
opj_j2k_decode_tiles(): apply whole single tile image decoding optimi…
rouault Sep 1, 2017
0ae3cba
Allow several repeated calls to opj_set_decode_area() and opj_decode(…
rouault Sep 1, 2017
b2cc8f7
Optimize reading/write into sparse array
rouault Sep 1, 2017
1644665
opj_j2k_update_image_data(): avoid zero-ing the buffer if not needed
rouault Sep 1, 2017
82a43d8
Optimize opj_dwt_decode_partial_1() when cas == 0
rouault Sep 1, 2017
18635df
test_decode_area: accept user bounds in -strip_height mode
rouault Sep 1, 2017
ccac773
Tiny perf improvement in T1 stage for subtile decoding
rouault Sep 1, 2017
873004c
Sub-tile decoding: speed up vertical pass in IDWT5x3 by processing 4 …
rouault Sep 1, 2017
470f3ed
opj_dwt_decode_partial_1_parallel(): add SSE2 optimization
rouault Sep 1, 2017
ae19001
opj_tcd_dc_level_shift_decode(): optimize lossy case
rouault Sep 1, 2017
83b5a16
opj_dwt_decode_partial_97(): simplify/more efficient use of sparse ar…
rouault Sep 1, 2017
8a17be8
opj_v4dwt_decode_step2_sse(): loop unroll
rouault Sep 1, 2017
7017e67
sparse_array: optimizations for lossy case
rouault Sep 1, 2017
559d16e
opj_t1_decode_cblk(): move some code to codeblock processor for (theo…
rouault Sep 1, 2017
2c365fe
Replace error message 'Not enough memory for tile data' by 'Size of t…
rouault Sep 1, 2017
4c7effa
opj_t1_clbl_decode_processor(): use SSE2 in subtile decoding code pat…
rouault Sep 1, 2017
676d4c8
opj_j2k_update_image_data(): avoid allocating image buffer if we can …
rouault Sep 1, 2017
c1e0fba
opj_v4dwt_decode_step1_sse(): rework a bit to improve code generation
rouault Sep 1, 2017
579b893
Replace uses of size_t by OPJ_SIZE_T
rouault Sep 4, 2017
Allow several repeated calls to opj_set_decode_area() and opj_decode() for single-tiled images

* Only works for single-tiled images; other cases error out cleanly, as they
  do currently.
* Avoids re-reading the codestream for the tile, and re-uses the code-blocks of
  the previous decoding pass.
* Future improvements might involve changing opj_decompress, and the image
  writing logic, to use this strategy.
rouault committed Sep 1, 2017
commit 0ae3cba3404674bbe2028ea9a801301a4c951b33
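
As a rough illustration of the "reading an image by chunks" pattern this commit targets, the sketch below walks an image in horizontal strips and calls a callback once per strip. It is a sketch under stated assumptions, not code from the PR: `decode_by_strips`, `decode_window_fn` and `count_rows` are hypothetical names, and the callback merely stands in for one opj_set_decode_area() + opj_decode() pair on a single-tiled image.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: stands in for one opj_set_decode_area() +
 * opj_decode() pair on the window [x0, x1) x [y0, y1).
 * Returns nonzero on success. */
typedef int (*decode_window_fn)(int x0, int y0, int x1, int y1, void *user);

/* Walk a width x height image in horizontal strips of strip_height rows,
 * invoking decode_window once per strip. Returns the number of strips
 * processed, or -1 as soon as one call fails. */
static int decode_by_strips(int width, int height, int strip_height,
                            decode_window_fn decode_window, void *user)
{
    int y0 = 0;
    int count = 0;

    assert(width > 0 && height > 0 && strip_height > 0);
    assert(decode_window != NULL);

    while (y0 < height) {
        /* Clamp the last strip to the bottom of the image. */
        int y1 = (height - y0 < strip_height) ? height : y0 + strip_height;
        if (!decode_window(0, y0, width, y1, user)) {
            return -1;
        }
        ++count;
        y0 = y1;
    }
    return count;
}

/* Example callback: accumulates the number of rows it was asked to decode. */
static int count_rows(int x0, int y0, int x1, int y1, void *user)
{
    (void)x0;
    (void)x1;
    *(int *)user += y1 - y0;
    return 1;
}
```

With a real codec, the callback would presumably reuse one codec and stream across all strips, which is the case this commit is meant to speed up.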
62 changes: 43 additions & 19 deletions src/lib/openjp2/j2k.c
@@ -9147,10 +9147,15 @@ OPJ_BOOL opj_j2k_set_decode_area(opj_j2k_t *p_j2k,
     OPJ_BOOL ret;
     OPJ_UINT32 it_comp;

+    if (p_j2k->m_cp.tw == 1 && p_j2k->m_cp.th == 1 &&
+            &p_j2k->m_cp.tcps[0].m_data != NULL) {
+        /* In the case of a single-tiled image whose codestream we have already */
+        /* ingested, go on */
+    }
     /* Check if we are read the main header */
-    if (p_j2k->m_specific_param.m_decoder.m_state != J2K_STATE_TPHSOT) {
+    else if (p_j2k->m_specific_param.m_decoder.m_state != J2K_STATE_TPHSOT) {
         opj_event_msg(p_manager, EVT_ERROR,
-                      "Need to decode the main header before begin to decode the remaining codestream");
+                      "Need to decode the main header before begin to decode the remaining codestream.\n");
         return OPJ_FALSE;
     }

@@ -10508,20 +10513,27 @@ static OPJ_BOOL opj_j2k_decode_tiles(opj_j2k_t *p_j2k,
     }

     for (;;) {
-        if (! opj_j2k_read_tile_header(p_j2k,
-                                       &l_current_tile_no,
-                                       NULL,
-                                       &l_tile_x0, &l_tile_y0,
-                                       &l_tile_x1, &l_tile_y1,
-                                       &l_nb_comps,
-                                       &l_go_on,
-                                       p_stream,
-                                       p_manager)) {
-            return OPJ_FALSE;
-        }
+        if (p_j2k->m_cp.tw == 1 && p_j2k->m_cp.th == 1 &&
+                p_j2k->m_cp.tcps[0].m_data != NULL) {
+            l_current_tile_no = 0;
+            p_j2k->m_current_tile_number = 0;
+            p_j2k->m_specific_param.m_decoder.m_state |= J2K_STATE_DATA;
+        } else {
+            if (! opj_j2k_read_tile_header(p_j2k,
+                                           &l_current_tile_no,
+                                           NULL,
+                                           &l_tile_x0, &l_tile_y0,
+                                           &l_tile_x1, &l_tile_y1,
+                                           &l_nb_comps,
+                                           &l_go_on,
+                                           p_stream,
+                                           p_manager)) {
+                return OPJ_FALSE;
+            }

-        if (! l_go_on) {
-            break;
+            if (! l_go_on) {
+                break;
+            }
         }

         if (! opj_j2k_decode_tile(p_j2k, l_current_tile_no, NULL, 0,
@@ -10538,7 +10550,16 @@ static OPJ_BOOL opj_j2k_decode_tiles(opj_j2k_t *p_j2k,
                                         p_j2k->m_output_image)) {
             return OPJ_FALSE;
         }
-        opj_j2k_tcp_data_destroy(&p_j2k->m_cp.tcps[l_current_tile_no]);
+
+        if (p_j2k->m_cp.tw == 1 && p_j2k->m_cp.th == 1 &&
+                !(p_j2k->m_output_image->x0 == p_j2k->m_private_image->x0 &&
+                  p_j2k->m_output_image->y0 == p_j2k->m_private_image->y0 &&
+                  p_j2k->m_output_image->x1 == p_j2k->m_private_image->x1 &&
+                  p_j2k->m_output_image->y1 == p_j2k->m_private_image->y1)) {
+            /* Keep current tcp data */
+        } else {
+            opj_j2k_tcp_data_destroy(&p_j2k->m_cp.tcps[l_current_tile_no]);
+        }

         opj_event_msg(p_manager, EVT_INFO,
                       "Image data has been updated with tile %d.\n\n", l_current_tile_no + 1);
@@ -10738,9 +10759,11 @@ OPJ_BOOL opj_j2k_decode(opj_j2k_t * p_j2k,
         }
     }

-    p_j2k->m_output_image = opj_image_create0();
-    if (!(p_j2k->m_output_image)) {
-        return OPJ_FALSE;
+    if (p_j2k->m_output_image == NULL) {
+        p_j2k->m_output_image = opj_image_create0();
+        if (!(p_j2k->m_output_image)) {
+            return OPJ_FALSE;
+        }
     }
     opj_copy_image_header(p_image, p_j2k->m_output_image);

@@ -10760,6 +10783,7 @@ OPJ_BOOL opj_j2k_decode(opj_j2k_t * p_j2k,
     for (compno = 0; compno < p_image->numcomps; compno++) {
         p_image->comps[compno].resno_decoded =
             p_j2k->m_output_image->comps[compno].resno_decoded;
+        opj_image_data_free(p_image->comps[compno].data);
         p_image->comps[compno].data = p_j2k->m_output_image->comps[compno].data;
 #if 0
         char fn[256];
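
The keep-or-destroy decision added to opj_j2k_decode_tiles() above can be read as a small predicate: the tile's codestream data is retained only for a single-tiled image whose requested output window differs from the full image. The sketch below restates that condition in isolation. `bounds_t`, `same_bounds` and `keep_tile_data` are hypothetical names used for illustration only; they are simplified stand-ins, not OpenJPEG types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the x0/y0/x1/y1 fields of an image header. */
typedef struct {
    unsigned x0, y0, x1, y1;
} bounds_t;

static bool same_bounds(const bounds_t *a, const bounds_t *b)
{
    return a->x0 == b->x0 && a->y0 == b->y0 &&
           a->x1 == b->x1 && a->y1 == b->y1;
}

/* Mirrors the condition in the diff: keep the tile data only when the image
 * has exactly one tile (tw == 1 && th == 1) and the decoded window is not the
 * whole image, so that a later partial decode can reuse it. */
static bool keep_tile_data(unsigned tw, unsigned th,
                           const bounds_t *output, const bounds_t *full)
{
    assert(output != NULL && full != NULL);
    return tw == 1 && th == 1 && !same_bounds(output, full);
}
```

Under this reading, decoding the full image frees the tile data as before, so callers that never set a sub-window see no extra memory retained.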
6 changes: 6 additions & 0 deletions src/lib/openjp2/openjpeg.h
@@ -1340,6 +1340,12 @@ OPJ_API OPJ_BOOL OPJ_CALLCONV opj_read_header(opj_stream_t *p_stream,
  * that is to say at the highest resolution level, even if requesting the image at lower
  * resolution levels.
  *
+ * Generally opj_set_decode_area() should be followed by opj_decode(), and the
+ * codec cannot be re-used.
+ * In the particular case of an image made of a single tile, several sequences of
+ * calls to opoj_set_decode_area() and opj_decode() are allowed, and will bring
+ * performance improvements when reading an image by chunks.
+ *
  * @param p_codec the jpeg2000 codec.
  * @param p_image the decoded image previously setted by opj_read_header
  * @param p_start_x the left position of the rectangle to decode (in image coordinates).
42 changes: 40 additions & 2 deletions src/lib/openjp2/t1.c
@@ -1668,6 +1668,11 @@ static void opj_t1_clbl_decode_processor(void* user_data, opj_tls_t* tls)
         }
     }

+    /* Both can be non NULL if for example decoding a full tile and then */
+    /* partially a tile. In which case partial decoding should be the */
+    /* priority */
+    assert((cblk->decoded_data != NULL) || (tilec->data != NULL));
+
     if (cblk->decoded_data) {
         if (tccp->qmfbid == 1) {
             for (j = 0; j < cblk_h; ++j) {
@@ -1763,15 +1768,24 @@ void opj_t1_decode_cblks(opj_tcd_t* tcd,
                                     (OPJ_UINT32)precinct->y0,
                                     (OPJ_UINT32)precinct->x1,
                                     (OPJ_UINT32)precinct->y1)) {
+                        for (cblkno = 0; cblkno < precinct->cw * precinct->ch; ++cblkno) {
+                            opj_tcd_cblk_dec_t* cblk = &precinct->cblks.dec[cblkno];
+                            if (cblk->decoded_data) {
+#ifdef DEBUG_VERBOSE
+                                printf("Discarding codeblock %d,%d at resno=%d, bandno=%d\n",
+                                       cblk->x0, cblk->y0, resno, bandno);
+#endif
+                                opj_free(cblk->decoded_data);
+                                cblk->decoded_data = NULL;
+                            }
+                        }
                         continue;
                     }

                     for (cblkno = 0; cblkno < precinct->cw * precinct->ch; ++cblkno) {
                         opj_tcd_cblk_dec_t* cblk = &precinct->cblks.dec[cblkno];
                         opj_t1_cblk_decode_processing_job_t* job;

-                        assert(cblk->decoded_data == NULL);
-
                         if (!opj_tcd_is_subband_area_of_interest(tcd,
                                 tilec->compno,
                                 resno,
@@ -1780,15 +1794,34 @@ void opj_t1_decode_cblks(opj_tcd_t* tcd,
                                 (OPJ_UINT32)cblk->y0,
                                 (OPJ_UINT32)cblk->x1,
                                 (OPJ_UINT32)cblk->y1)) {
+                            if (cblk->decoded_data) {
+#ifdef DEBUG_VERBOSE
+                                printf("Discarding codeblock %d,%d at resno=%d, bandno=%d\n",
+                                       cblk->x0, cblk->y0, resno, bandno);
+#endif
+                                opj_free(cblk->decoded_data);
+                                cblk->decoded_data = NULL;
+                            }
                             continue;
                         }

                         if (!tcd->whole_tile_decoding) {
                             OPJ_UINT32 cblk_w = (OPJ_UINT32)(cblk->x1 - cblk->x0);
                             OPJ_UINT32 cblk_h = (OPJ_UINT32)(cblk->y1 - cblk->y0);
+                            if (cblk->decoded_data != NULL) {
+#ifdef DEBUG_VERBOSE
+                                printf("Reusing codeblock %d,%d at resno=%d, bandno=%d\n",
+                                       cblk->x0, cblk->y0, resno, bandno);
+#endif
+                                continue;
+                            }
                             if (cblk_w == 0 || cblk_h == 0) {
                                 continue;
                             }
+#ifdef DEBUG_VERBOSE
+                            printf("Decoding codeblock %d,%d at resno=%d, bandno=%d\n",
+                                   cblk->x0, cblk->y0, resno, bandno);
+#endif
                             /* Zero-init required */
                             cblk->decoded_data = opj_calloc(1, cblk_w * cblk_h * sizeof(OPJ_INT32));
                             if (cblk->decoded_data == NULL) {
@@ -1803,6 +1836,11 @@ void opj_t1_decode_cblks(opj_tcd_t* tcd,
                                 *pret = OPJ_FALSE;
                                 return;
                             }
+                        } else if (cblk->decoded_data) {
+                            /* Not sure if that code path can happen, but better be */
+                            /* safe than sorry */
+                            opj_free(cblk->decoded_data);
+                            cblk->decoded_data = NULL;
                         }

                         job = (opj_t1_cblk_decode_processing_job_t*) opj_calloc(1,
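
Taken together, the t1.c hunks above give each code-block one of a few outcomes on the sub-tile path: its cached buffer is discarded when the block falls outside the area of interest, reused when a buffer from a previous pass is still attached, or freshly allocated for decoding. The following is a minimal, self-contained model of that policy, assuming a simplified `cblk_t` and hypothetical names (`prepare_cblk`, `cblk_action_t`); it does not use OpenJPEG's own types or allocator, and it leaves out the whole-tile branch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a decoder code-block (hypothetical type). */
typedef struct {
    int x0, y0, x1, y1;
    int32_t *decoded_data;   /* NULL when no decoded samples are cached */
} cblk_t;

typedef enum {
    CBLK_OUTSIDE,   /* outside the area of interest: cache dropped, skip */
    CBLK_REUSED,    /* decoded in a previous pass: nothing to do */
    CBLK_EMPTY,     /* zero-sized block: skip */
    CBLK_DECODE,    /* buffer allocated: caller must decode into it */
    CBLK_NOMEM      /* allocation failed */
} cblk_action_t;

/* Decide what to do with one code-block on the sub-tile decoding path. */
static cblk_action_t prepare_cblk(cblk_t *cblk, int in_area_of_interest)
{
    size_t w, h;

    assert(cblk != NULL);

    if (!in_area_of_interest) {
        /* free(NULL) is a no-op, so no need to test the pointer first. */
        free(cblk->decoded_data);
        cblk->decoded_data = NULL;
        return CBLK_OUTSIDE;
    }
    if (cblk->decoded_data != NULL) {
        return CBLK_REUSED;
    }
    w = (size_t)(cblk->x1 - cblk->x0);
    h = (size_t)(cblk->y1 - cblk->y0);
    if (w == 0 || h == 0) {
        return CBLK_EMPTY;
    }
    /* Zero-initialised buffer, as the diff's "Zero-init required" notes. */
    cblk->decoded_data = calloc(w * h, sizeof(int32_t));
    return cblk->decoded_data != NULL ? CBLK_DECODE : CBLK_NOMEM;
}
```

Calling `prepare_cblk` twice on the same in-window block would allocate on the first call and report reuse on the second, which is where the saving for strip-by-strip reads comes from.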
4 changes: 4 additions & 0 deletions tests/CMakeLists.txt
@@ -103,6 +103,10 @@ add_test(NAME tda_prep_irreversible_203_201_17_19_no_precinct COMMAND test_tile_
 add_test(NAME tda_irreversible_203_201_17_19_no_precinct COMMAND test_decode_area -q irreversible_203_201_17_19_no_precinct.j2k)
 set_property(TEST tda_irreversible_203_201_17_19_no_precinct APPEND PROPERTY DEPENDS tda_prep_irreversible_203_201_17_19_no_precinct)

+add_test(NAME tda_prep_strip COMMAND test_tile_encoder 1 256 256 256 256 8 0 tda_single_tile.j2k)
+add_test(NAME tda_strip COMMAND test_decode_area -q -strip_height 3 -strip_check tda_single_tile.j2k)
+set_property(TEST tda_strip APPEND PROPERTY DEPENDS tda_prep_strip)
+
 add_executable(include_openjpeg include_openjpeg.c)

 # No image send to the dashboard if lib PNG is not available.