xerces-c can not deal with high GBK file
Kirby Zhou
kirbyzhou at sohu-rd.com
Mon Jul 26 10:43:39 UTC 2010
:-)
And if you decide to use ICU instead of IconvGNU in xerces,
there seems to be another bug:
ICUTranscoder::transcodeFrom

    UErrorCode err = U_ZERO_ERROR;
    ucnv_toUnicode
    (
        fConverter
        , &startTarget
        , startTarget + maxChars
        , (const char**)&startSrc
        , (const char*)endSrc
        , (fFixed ? 0 : (int32_t*)fSrcOffsets)
        , false
        , &err
    );

It seems a mutex is needed to protect fConverter.
ICULCPTranscoder::calcRequiredSize uses 'XMLMutexLock
lockConverter(&fMutex);' for that.
I do not know why the xerces authors do not do the same thing here.
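
To make the locking idea concrete, here is a minimal sketch in standard C++.
It is not Xerces or ICU code: StatefulConverter and GuardedTranscoder are
hypothetical names, and std::mutex with std::lock_guard stands in for the
XMLMutexLock pattern quoted above. It only matters if one converter instance
is really shared between threads, which I have not verified for ICUTranscoder.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical stand-in for a stateful converter handle such as fConverter:
// every call mutates internal state, so concurrent calls would race.
class StatefulConverter {
public:
    std::size_t convert(const std::string& src, std::u16string& out) {
        for (unsigned char c : src) {
            out.push_back(static_cast<char16_t>(c));  // widen each byte to one UTF-16 unit
            ++unitsConverted_;                        // shared mutable state
        }
        return src.size();
    }
    std::size_t unitsConverted() const { return unitsConverted_; }
private:
    std::size_t unitsConverted_ = 0;
};

// Hypothetical transcoder wrapper: every access to the converter holds the
// mutex, and the lock is released automatically when the guard leaves scope.
class GuardedTranscoder {
public:
    std::size_t transcodeFrom(const std::string& src, std::u16string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        return converter_.convert(src, out);
    }
    std::size_t unitsConverted() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return converter_.unitsConverted();
    }
private:
    mutable std::mutex mutex_;
    StatefulConverter converter_;
};

// Small usage check: one call through the guarded wrapper, then read the
// converter's counter back through the same lock.
inline std::size_t demoGuardedTranscode() {
    GuardedTranscoder t;
    std::u16string out;
    const std::size_t eaten = t.transcodeFrom("abcd", out);
    assert(eaten == 4 && out.size() == 4);
    return t.unitsConverted();
}
```

Several threads could share one GuardedTranscoder and call transcodeFrom
concurrently; the scoped guard keeps each conversion atomic with respect to
the others, at the cost of serializing them.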
Regards,
Kirby Zhou
from SOHU-RD +86-10-6272-8261
-----Original Message-----
From: epel-devel-list-bounces at redhat.com
[mailto:epel-devel-list-bounces at redhat.com] On Behalf Of Stephen John
Smoogen
Sent: Saturday, July 24, 2010 1:21 AM
To: EPEL development discussion
Subject: Re: xerces-c can not deal with high GBK file
Thanks for the bug report. Will see what we can do with it.
On Fri, Jul 23, 2010 at 01:41, Kirby Zhou <kirbyzhou at sohu-rd.com> wrote:
> xerces-c-3.0.1/2.7.0 can not deal with high GBK file
>
> There is a bug inside util/Transcoders/IconvGNU/IconvGNUTransService.cpp.
>
>     for (size_t cnt = 0; cnt < maxChars && srcLen; cnt++) {
>         size_t rc = iconvFrom(startSrc, &srcLen, &orgTarget, uChSize());
>         if (rc == (size_t)-1) {
>             if (errno != E2BIG || prevSrcLen == srcLen) {
>                 ThrowXMLwithMemMgr(TranscodingException, XMLExcepts::Trans_BadSrcSeq, getMemoryManager());
>             }
>         }
>         charSizes[cnt] = prevSrcLen - srcLen;
>         prevSrcLen = srcLen;
>         bytesEaten += charSizes[cnt];
>         startSrc = endSrc - srcLen;
>         toReturn++;
>     }
>
> If a huge file is passed to xerces, the text is handed to
> IconvGNUTranscoder in pieces, so a piece can end with an incomplete
> multibyte sequence. iconv reports that case with errno EINVAL, but the
> loop above does not check for EINVAL: any errno other than E2BIG is
> treated as a bad source sequence.
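
A minimal sketch of handling EINVAL with plain iconv, to illustrate the point
above. This is not the Xerces code or a proposed patch: convertAvailable and
convertInChunks are hypothetical helpers, and UTF-8 to UCS-4LE is used as a
stand-in for GBK only because those converters are available almost
everywhere. The assumption is that on EINVAL the unconsumed tail bytes are
kept and retried once the next piece of input arrives. It also assumes the
POSIX prototype where iconv takes char** for the input buffer.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts as much of `input` as possible, appends the
// result to `out`, and returns the number of input bytes consumed. An
// incomplete trailing multibyte sequence (EINVAL) is left unconsumed.
std::size_t convertAvailable(iconv_t cd, const std::string& input, std::string& out) {
    char* src = const_cast<char*>(input.data());  // iconv does not modify the input
    std::size_t srcLeft = input.size();
    char buf[64];
    while (srcLeft > 0) {
        char* dst = buf;
        std::size_t dstLeft = sizeof buf;
        const std::size_t rc = iconv(cd, &src, &srcLeft, &dst, &dstLeft);
        // Save errno right away, before any other call can overwrite it.
        const int err = (rc == static_cast<std::size_t>(-1)) ? errno : 0;
        out.append(buf, sizeof buf - dstLeft);
        if (err == 0 || err == EINVAL) break;  // done, or tail is incomplete
        if (err != E2BIG) throw std::runtime_error("bad source sequence");
        // E2BIG: the output buffer was full, loop again with an empty one.
    }
    return input.size() - srcLeft;
}

// Hypothetical driver: feeds two chunks and carries unconsumed bytes from
// the first chunk over to the second.
std::string convertInChunks(const std::string& first, const std::string& second) {
    iconv_t cd = iconv_open("UCS-4LE", "UTF-8");
    if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");
    struct Closer { iconv_t cd; ~Closer() { iconv_close(cd); } } closer{cd};

    std::string out;
    std::string pending = first;
    std::size_t used = convertAvailable(cd, pending, out);
    assert(used <= pending.size());
    pending.erase(0, used);   // keep only the incomplete tail, if any
    pending += second;
    used = convertAvailable(cd, pending, out);
    if (used != pending.size())
        throw std::runtime_error("input ends inside a multibyte sequence");
    return out;
}
```

For example, convertInChunks("a\xE4\xB8", "\xAD" "b") splits the three-byte
UTF-8 encoding of U+4E2D across the two chunks; the first call stops at the
incomplete tail instead of failing, and the second call completes it.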
>
>
>
> Regards,
> Kirby Zhou
> from SOHU-RD +86-10-6272-8261
>
>
>
> _______________________________________________
> epel-devel-list mailing list
> epel-devel-list at redhat.com
> https://www.redhat.com/mailman/listinfo/epel-devel-list
>
--
Stephen J Smoogen.
The core skill of innovators is error recovery, not failure avoidance.
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things."
Herb Kelleher, founder Southwest Airlines