Detect encoding from ReadOnlySpan<byte> #204
Thanks for the PR!
This is supported in .NET 8? So we could use
Note: I will remove .NET 6 support first (#205). Update: done.
Close/reopen for new merge commit
private static void WriteSpanToStream(MemoryStream stream, ReadOnlySpan<byte> buffer)
{
#if NETSTANDARD2_1_OR_GREATER || NETCOREAPP2_1_OR_GREATER
    // Stream.Write(ReadOnlySpan<byte>) is available on these targets.
    stream.Write(buffer);
#else
    // Fallback: copy the span into a pooled array, then write the array.
    // Rent may return a larger array, so only buffer.Length bytes are written.
    byte[] rent = ArrayPool<byte>.Shared.Rent(buffer.Length);
    try
    {
        buffer.CopyTo(rent);
        stream.Write(rent, 0, buffer.Length);
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(rent);
    }
#endif
}
I also updated the System.Memory package to resolve a version conflict with the System.Memory dependency from Microsoft.SourceLink.GitHub.
Add an overload that receives ReadOnlySpan<byte> instead of byte[], so callers can detect the encoding of a Span<T> or ReadOnlySpan<T> without copying to a byte[].
The existing byte[] overloads forward to it. The other methods invoked from DetectFromBytes now take ReadOnlySpan<byte> and use slicing instead of offset/len.
This also affects some related methods, such as CharsetDetector.Feed and CharsetProber.HandleData.
Most of the changes are just signature updates and slicing instead of passing an offset to methods.
As an implementation note, since .NET Standard 2.0 does not have a MemoryStream.Write(ReadOnlySpan<byte>) method, the data is copied into an array buffer and then written to the stream. This may reduce performance slightly, but I think it is the best approach without using unsafe blocks or reflection.
Also, this may break code outside of UTF-unknown that overrides CharsetDetector.Feed or derives from CharsetProber, but I believe that migrating to the new signature should not be that hard.
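For readers who want to see the shape of the change, here is a minimal sketch of the forwarding pattern described above: a span-based core method, with the byte[] and offset/len overloads delegating to it through slices. The Bom enum and BomSniffer class are hypothetical stand-ins invented for illustration. They are not part of UTF-unknown, and the real detection logic is far more involved than this three-BOM check.

```csharp
using System;

// Hypothetical result type, for illustration only.
public enum Bom { None, Utf8, Utf16LE, Utf16BE }

// Hypothetical stand-in for a detector; NOT the UTF-unknown API.
public static class BomSniffer
{
    // Span-based core: accepts arrays, stackalloc buffers, and slices
    // without requiring the caller to copy into a new byte[].
    public static Bom DetectBom(ReadOnlySpan<byte> bytes)
    {
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
            return Bom.Utf8;
        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return Bom.Utf16LE;
        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return Bom.Utf16BE;
        return Bom.None;
    }

    // The existing array overload simply forwards to the span overload.
    public static Bom DetectBom(byte[] bytes)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        return DetectBom(new ReadOnlySpan<byte>(bytes));
    }

    // An offset/len pair becomes a slice; the callee no longer tracks offsets.
    public static Bom DetectBom(byte[] bytes, int offset, int len)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        return DetectBom(new ReadOnlySpan<byte>(bytes, offset, len));
    }
}
```

A caller holding a stack-allocated or sliced buffer could then write something like `ReadOnlySpan<byte> header = stackalloc byte[] { 0xEF, 0xBB, 0xBF, 0x41 }; var bom = BomSniffer.DetectBom(header);` with no intermediate array. Keeping the array overloads as thin wrappers means existing callers compile unchanged while the logic lives in one place.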