You'll never convince me that this operation is a performance bottleneck. The efficient way is to make good use of your time by using the standard C library:

    static unsigned char gethex(const char *s, char **endptr) {
        assert(s);
        while (isspace(*s)) s++;
        assert(*s);
        return strtoul(s, endptr, 16);
    }

    unsigned char *convert(const char *s, int *length) {
        unsigned char *answer = malloc((strlen(s) + 1) / 3);
        unsigned char *p;
        for (p = answer; *s; p++)
            *p = gethex(s, (char **)&s);
        *length = p - answer;
        return answer;
    }

Compiled and tested. Works on your example.
I chose this as the answer because it simply provided a working example. Thanks! – Gbps Jul 11 '10 at 0:51
OTOH, buffer overflow on "A B C D E F 1 2 3 4 5 6 7 8 9". – Ben Voigt Jul 11 '10 at 1:08
Much simpler: for (i=0; i – R.. Jul 11 '10 at 4:54
@R: great point about strtoul---I didn't read the man page carefully enough. Feel free to edit. – Norman Ramsey Jul 11 '10 at 5:46
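The two points raised in these comments (the buffer overflow on single-digit input, and strtoul skipping leading whitespace by itself) can be combined into one loop. The following is only a sketch: the name convert_hex, the size_t length parameter, and the choice to return NULL on malformed input are assumptions made for illustration, not part of the original answer. Note that strtoul also accepts an optional sign and a 0x prefix, which this sketch does not try to forbid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse whitespace-separated hex values from s into a
 * newly allocated buffer. Returns NULL on allocation failure or malformed
 * input. On success *length holds the byte count; the caller frees the buffer. */
unsigned char *convert_hex(const char *s, size_t *length)
{
    assert(s != NULL && length != NULL);

    /* Every value consumes at least one character, so strlen(s) + 1 bytes
     * is always enough, even for single-digit input such as "A B C". */
    unsigned char *answer = malloc(strlen(s) + 1);
    if (answer == NULL)
        return NULL;

    size_t n = 0;
    for (;;) {
        char *end;
        unsigned long value = strtoul(s, &end, 16); /* skips leading whitespace */
        if (end == s)
            break;                  /* nothing converted: end of input or junk */
        if (value > 0xFF) {         /* value does not fit in one byte */
            free(answer);
            return NULL;
        }
        answer[n++] = (unsigned char)value;
        s = end;
    }

    /* Anything left over other than whitespace is a parse error. */
    s += strspn(s, " \t\r\n");
    if (*s != '\0') {
        free(answer);
        return NULL;
    }

    *length = n;
    return answer;
}
```

Allocating strlen(s) + 1 bytes wastes a little memory compared with (strlen(s) + 1) / 3, but it stays safe when the input does not use exactly two digits per value.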
Iterate through all the characters. If you have a hex digit, the number is (ch >= 'A')? (ch - 'A' + 10): (ch - '0').
Left shift your accumulator by four bits and add (or OR) in the new digit. If you have a space, and the previous character was not a space, then append your current accumulator value to the array and reset the accumulator back to zero.
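A minimal sketch of the steps above, under the same assumption the answer makes (the string contains only spaces, '0'-'9' and uppercase 'A'-'F'). The function name parse_hex_bytes, the caller-supplied output buffer, and the extra flush at the end of the string (needed when the last value is not followed by a space) are illustrative choices, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert space-separated uppercase hex values in s
 * into out[]. Assumes s holds only '0'-'9', 'A'-'F' and spaces.
 * Returns the number of bytes written (at most out_cap). */
size_t parse_hex_bytes(const char *s, unsigned char *out, size_t out_cap)
{
    assert(s != NULL && out != NULL);

    unsigned int acc = 0;  /* value being accumulated */
    int in_value = 0;      /* nonzero once a digit was seen since the last space */
    size_t n = 0;

    for (;; s++) {
        char ch = *s;
        if (ch == ' ' || ch == '\0') {
            if (in_value) {
                if (n == out_cap)
                    return n;              /* output buffer full: stop early */
                out[n++] = (unsigned char)acc;
                acc = 0;
                in_value = 0;
            }
            if (ch == '\0')
                break;                     /* end of input */
        } else {
            /* Digit value as described in the answer. */
            int digit = (ch >= 'A') ? (ch - 'A' + 10) : (ch - '0');
            acc = (acc << 4) | (unsigned int)digit;
            in_value = 1;
        }
    }
    return n;
}
```

Because a value is only appended when a digit preceded the separator, runs of several spaces and a missing leading zero are both handled without special cases.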
+1: This is probably the most straightforward and simple way to do it. – James McNellis Jul 10 '10 at 23:22
That's basically what I did, except for using a switch instead of the ternary test. Depending on compiler and processor architecture, one or the other may be faster. But you should also test that every character is in the range 0-9A-F, and that means testing the same thing twice. – kriss Jul 10 '10 at 23:42
@kriss: It's all in the assumptions. You assume that there must be exactly two hex digits and one space between each value; mine allows omission of a leading zero or multiple spaces, but assumes that there are no other classes of characters in the string. If you can't assume that, I'd probably choose to do validation separately, by testing if (s[strspn(s, " 0123456789ABCDEF")]) /* error */; Sure, it's another pass on the string, but so much cleaner. Or avoid the second pass over the string by using isspace and isxdigit on each character, which uses a lookup table for speed. – Ben Voigt Jul 11 '10 at 0:19
Looping around switches is not really an issue; I do not really see it as a difference. I chose to assume there were exactly two hex chars in the input, because if you allow more than that you should also range-check the values. And what about allowing negative numbers? We would have to manage the sign, etc. A switch is a kind of lookup table... (and another fast conversion method would be to really use one implemented as an array). – kriss Jul 11 '10 at 0:40
The problem specified that all inputs were unsigned. The problem didn't specify that there would always be zeros padding to exactly two digits (e.g. all of these fit in a char: 0xA, 0x0A, 0x000A) or just one space, although these assumptions were true on the sample input. – Ben Voigt Jul 11 '10 at 1:23
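The two validation options mentioned in this thread can be sketched as follows. The function names is_valid_strspn and is_valid_ctype are made up for illustration. Be aware that the two checks are not equivalent: isxdigit also accepts lowercase a-f, and isspace accepts tabs and newlines as well as spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Separate validation pass: nonzero if s consists only of spaces and
 * uppercase hex digits. strspn returns the length of the leading run of
 * accepted characters, so the input is valid when that run ends at the NUL. */
int is_valid_strspn(const char *s)
{
    assert(s != NULL);
    return s[strspn(s, " 0123456789ABCDEF")] == '\0';
}

/* Per-character check that could instead be folded into the parsing loop.
 * The cast to unsigned char keeps the <ctype.h> calls well defined for
 * characters with negative values. */
int is_valid_ctype(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        unsigned char ch = (unsigned char)*s;
        if (!isspace(ch) && !isxdigit(ch))
            return 0;
    }
    return 1;
}
```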
This answers the original question, which asked for a C++ solution. You can use an istringstream with the hex manipulator:

    std::string hex_chars("E8 48 D8 FF FF 8B 0D");
    std::istringstream hex_chars_stream(hex_chars);
    std::vector<unsigned char> bytes;
    unsigned int c;
    while (hex_chars_stream >> std::hex >> c) {
        bytes.push_back(c);
    }

Note that c must be an int (or long, or some other integer type), not a char; if it is a char (or unsigned char), the wrong >> overload will be called and individual characters will be extracted from the string, not hexadecimal integer strings.
Additional error checking to ensure that the extracted value fits within a char would be a good idea.
+1 and deleting my equivalent (but not as good) answer. – Billy ONeal Jul 10 '10 at 23:11
Because I cannot give two correct answers, I went ahead and upvoted this one, as this definitely is a great solution for C++ users! – Gbps Jul 11 '10 at 0:50
For a pure C implementation I think you can persuade sscanf(3) to do what you want. I believe this should be portable (including the slightly dodgy type coercion to appease the compiler) so long as your input string is only ever going to contain two-character hex values.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char hex[] = "E8 48 D8 FF FF 8B 0D";
    char *p;
    int cnt = (strlen(hex) + 1) / 3; // Whether or not there's a trailing space
    unsigned char *result = (unsigned char *)malloc(cnt), *r;
    unsigned char c;

    for (p = hex, r = result; *p; p += 3) {
        if (sscanf(p, "%02X", (unsigned int *)&c) != 1) {
            break; // Didn't parse as expected
        }
        *r++ = c;
    }
Declare c as unsigned int, otherwise you could overwrite other local variables (or worse yet, your return address). – Ben Voigt Jul 11 '10 at 0:26
But generally scanf is going to take longer to figure out the format code than my entire answer will, and the question did ask for an efficient way. – Ben Voigt Jul 11 '10 at 0:28
@Ben Voigt. Yes, but does efficient mean run-time or programmer-time? '-) Anyway, thanks for pointing out that I should have made c an unsigned int and coerced that into the result array. – bjg Jul 11 '10 at 1:09
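A sketch of what the correction discussed in these comments might look like: the value is scanned into a real unsigned int and then narrowed into the result array. The function name scan_hex_bytes, the caller-supplied buffer, and the use of %n to advance past whatever was consumed (rather than stepping by a fixed 3 characters) are assumptions for illustration. The sketch also assumes the input holds only hex digits and whitespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read whitespace-separated hex values of at most two
 * digits each from s into out[]. Returns the number of bytes stored
 * (at most out_cap). */
size_t scan_hex_bytes(const char *s, unsigned char *out, size_t out_cap)
{
    assert(s != NULL && out != NULL);

    size_t n = 0;
    while (n < out_cap) {
        unsigned int value;  /* %x stores through an unsigned int pointer */
        int consumed = 0;    /* filled in by %n: characters read so far */
        if (sscanf(s, "%2x%n", &value, &consumed) != 1)
            break;           /* end of input or no hex digits here */
        out[n++] = (unsigned char)value;  /* at most two digits, so <= 0xFF */
        s += consumed;       /* step past leading whitespace and the digits */
    }
    return n;
}
```

Advancing by the consumed count avoids stepping past the terminating NUL when the string has no trailing space.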
If you know the length of the string to be parsed beforehand (e.g. you are reading something from /proc), you can use sscanf with the 'hh' type modifier, which specifies that the next conversion is one of diouxX and that the pointer to store it will be either signed char or unsigned char.

    // example: ipv6 address as seen in /proc/net/if_inet6:
    char myString[] = "fe80000000000000020c29fffe01bafb";
    unsigned char addressBytes[16];
    sscanf(myString,
           "%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx"
           "%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx",
           &addressBytes[0], &addressBytes[1], &addressBytes[2], &addressBytes[3],
           &addressBytes[4], &addressBytes[5], &addressBytes[6], &addressBytes[7],
           &addressBytes[8], &addressBytes[9], &addressBytes[10], &addressBytes[11],
           &addressBytes[12], &addressBytes[13], &addressBytes[14], &addressBytes[15]);
    int i;
    for (i = 0; i ...
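The same 'hh' conversion can be applied in a loop instead of one long format string. This is a sketch only: the name parse_packed_hex, the explicit length check, and the isxdigit pre-check (added because sscanf alone would accept a pair such as "1G" by reading just the first digit) are assumptions, not part of the original answer. The 'hh' modifier requires a C99 library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode a packed hex string (two digits per byte, no
 * separators) into exactly out_len bytes.
 * Returns 0 on success, -1 on wrong length or a non-hex character. */
int parse_packed_hex(const char *s, unsigned char *out, size_t out_len)
{
    assert(s != NULL && out != NULL);

    if (strlen(s) != 2 * out_len)
        return -1;

    for (size_t i = 0; i < out_len; i++) {
        const char *pair = s + 2 * i;
        /* Reject pairs that are not two hex digits before scanning. */
        if (!isxdigit((unsigned char)pair[0]) || !isxdigit((unsigned char)pair[1]))
            return -1;
        /* %2hhx reads at most two hex digits into an unsigned char. */
        if (sscanf(pair, "%2hhx", &out[i]) != 1)
            return -1;
    }
    return 0;
}
```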
The old C way, do it by hand ;-) (there are many shorter ways, but I'm not golfing, I'm going for run-time). enum { NBBYTES = 7 }; char res[NBBYTES+1]; const char * c = "E8 48 D8 FF FF 8B 0D"; const char * p = c; int i = 0; for (i = 0; i ... #include ... int main(){ enum { NBBYTES = 7 }; char res[NBBYTES]; const char * c = "E8 48 D8 FF FF 8B 0D"; const char * p = c; int i = -1; res[i] = 0; char ch = ' '; while (ch && i ... = NBBYTES-1){ printf("parse error, throw exception\n"); exit(-1); } for (i = 0 ; i ...
Are we allowed to say 'Ick!'? (If only because the code will 'throw exception' on the last loop, because there are only 6 spaces in the string, not 7 as the code requires.) – Jonathan Leffler Jul 10 '10 at 23:43
@Jonathan: not any more... I could also have added a space to the input. The old separators vs terminators debate. – kriss Jul 11 '10 at 0:43
Your little fix doesn't help... *p != ' ' on the terminating NUL and it doesn't matter what you logical-or that with. – Ben Voigt Jul 11 '10 at 1:05
Oops, I did err again. You should like the new fix better :-) – kriss Jul 11 '10 at 1:15
Validity check is still flaky. – Ben Voigt Jul 11 '10 at 1:24