Fix bug checking the length of binary types#263
Fix bug checking the length of binary types#263p32blo wants to merge 4 commits intoBergmann89:masterfrom
Conversation
Bergmann89
left a comment
There was a problem hiding this comment.
Hey @p32blo,
thanks for your contribution. The fix looks good in principle, but I’d
suggest a slightly different approach. Handling special constraints
directly in the generator shifts complexity to the wrong place. In my
opinion, it’s better to let the type itself handle this.
I would introduce the following types (in xsd-parser-types::xml):
pub struct HexString(pub String);
impl HexString {
/// Returns the byte length of the decoded data, and not the length of the string.
pub fn len(&self) -> usize { ... }
pub fn as_str(&self) -> &str { ... }
}
pub struct Base64String(pub String);
impl Base64String {
/// Returns the byte length of the decoded data, and not the length of the string.
pub fn len(&self) -> usize { ... }
pub fn as_str(&self) -> &str { ... }
}
pub struct HexBinary(pub Vec<u8>);
impl Deref for HexBinary { ... }
pub struct Base64Binary(pub Vec<u8>);
impl Deref for Base64Binary { ... }By default, HexString and Base64String would represent
xs:hexBinary and xs:base64Binary. This keeps compatibility with the
current string-based approach and avoids decoding unless necessary. If
a user needs the decoded form, they can switch to the binary types.
Providing suitable helpers in the config (see
Config::with_qname_type and Config::with_qname_type_from) would let
users choose the type they want.
This approach should not require changes to the interpreter or
generator, keeping the codebase cleaner while still solving the general
problem.
d4f8a57 to
32f1f3c
Compare
When we have a restriction on a binary type to check for its length, the size that needs to be considered is the octet size. So for a `xs:hexBinary` that might be represented by a string of 4 caracteres it actually only has length 2. The same for `xs:base64Binary` but the computation of the length is not as straight forward. Here is the relevant part of the standard: **lenght** https://www.w3.org/TR/xmlschema11-2/#rf-length **hexBinary** https://www.w3.org/TR/xmlschema11-2/#hexBinary **base64Binary** https://www.w3.org/TR/xmlschema11-2/#base64Binary
|
Thanks for the feedback. I tried to give it a go :) I was not able to figure out how to make the |
| }; | ||
|
|
||
| #[cfg(any(feature = "quick-xml", feature = "serde"))] | ||
| use base64::{engine::general_purpose, Engine as _}; |
There was a problem hiding this comment.
This should be only used if the base64 feature flag is set (to reduce unneeded dependencies).
#[cfg(all(feature = "base64", any(feature = "quick-xml", feature = "serde")))]
use base64::{engine::general_purpose, Engine as _};
Of cause the different serialize and deserialize traits then also needs this feature flag check.
| } | ||
| } | ||
|
|
||
| impl Deref for Base64String { |
There was a problem hiding this comment.
I would also like to have an implementation for DerefMut.
| pub struct Base64Binary(pub Vec<u8>); | ||
|
|
||
| impl Deref for Base64Binary { | ||
| type Target = [u8]; |
There was a problem hiding this comment.
This should deref to Vec<u8> not [u8]. The vector might provide additional functionality, that the pure slice does not have.
And I would like to have an implementation for DerefMut here as well.
There was a problem hiding this comment.
I would rename this to base64.rs since it has the definitions for Base64String and Base64Binary.
Same for hex.rs.
| pub struct HexBinary(pub Vec<u8>); | ||
|
|
||
| impl Deref for HexBinary { | ||
| type Target = [u8]; |
There was a problem hiding this comment.
Same as for base64: This should target Vec<u8> and both types should implement DerefMut.
| pub use self::base64binary::{Base64Binary as Base64BinaryBytes, Base64String}; | ||
| pub use self::hexbinary::{HexBinary, HexString}; | ||
|
|
||
| pub use self::any_simple_type::{AnySimpleType, Base64Binary, Decimal, Integer, Unsigned}; |
There was a problem hiding this comment.
You can replace the Base64Binary in the any_simple_type module with your new implementations (using Base64String as replacement for String and Base64Binary as replacement for Vec<u8>).
When we have a restriction on a binary type to check for its length, the size that needs to be considered is the octet size.
So for a
xs:hexBinarythat might be represented by a string of 4 caracteres it actually only has length 2.The same for
xs:base64Binarybut the computation of the length is not as straight forward.Here is the relevant part of the standard:
lenght - https://www.w3.org/TR/xmlschema11-2/#rf-length
hexBinary - https://www.w3.org/TR/xmlschema11-2/#hexBinary
base64Binary - https://www.w3.org/TR/xmlschema11-2/#base64Binary
This is my attempt to fix this, but feel free to change it completely if it does not make sense to do it this way :)