What are some of the drawbacks to using C-style strings?

C strings lack the following aspects of their C++ counterparts:

Automatic memory management: you have to allocate and free their memory manually.

Extra capacity for concatenation efficiency: C++ strings often have a capacity greater than their size. This allows increasing the size without many reallocations.

Support for embedded NULs: by definition a NUL character ends a C string; C++ strings keep an internal size counter, so they don't need a special value to mark their end.

Sensible comparison and assignment operators: even though comparison of C string pointers is permitted, it's almost always not what was intended. Similarly, assigning C string pointers (or passing them to functions) creates ownership ambiguities.
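To make the comparison point concrete, here is a small sketch. The function names are made up for illustration; only strcmp and std::string's operator== are real library facilities.

```cpp
#include <cstring>
#include <string>

// Compares two C strings by content, which is what strcmp is for.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Compares only the addresses. Two separate buffers holding the same
// text still live at different addresses.
bool same_pointer(const char* a, const char* b) {
    return a == b;
}

// std::string's operator== compares the contents.
bool same_string(const std::string& a, const std::string& b) {
    return a == b;
}
```

With two separate char arrays that both hold "hello", same_text and same_string report a match while same_pointer does not.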

And the fact that many "obvious" string operations seem to compile, but do something completely different than expected (== compares the pointers to the strings, not the strings themselves, and + doesn't concatenate). – jalf Nov 23 '08 at 19:33

And of course, assignment doesn't do what you might expect either. :) The fundamental problem with C-style strings is that they just don't behave as strings. – jalf Nov 23 '08 at 19:34

No reason why you couldn't have extra capacity on a C string. Just allocate more than you need and put an early NUL. – Evan Teran Nov 23 '08 at 19:43

@Evan Teran: sure you could over-allocate, but then you'd need a separate variable to keep track of the capacity. std::basic_string has this built in. – efotinis Nov 23 '08 at 21:39

@jalf: Nice one, I'm adding that too. Thanks! – efotinis Nov 23 '08 at 21:40
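As a rough illustration of the capacity point, std::string carries both its size and its capacity, so no separate bookkeeping variable is needed. The helper name repeat is hypothetical; reserve, push_back, size and capacity are standard members.

```cpp
#include <cstddef>
#include <string>

// Builds a string of n copies of c. Reserving up front means the
// appends in the loop should not need to reallocate.
std::string repeat(char c, std::size_t n) {
    std::string s;
    s.reserve(n);                      // capacity() is at least n from here on
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back(c);                // grows size(), stays within capacity()
    }
    return s;
}
```

Doing the same with a raw char array would mean tracking the allocated size in a variable of your own, which is the objection raised above.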

Not having the length accessible in constant-time is a serious overhead in many applications.

You could store the begin and end pointer, if it's an issue. – Jasper Bekkers Nov 23 '08 at 15:34

It is no longer "C-style strings" in that case, but a new kind of object. – bortzmeyer Nov 23 '08 at 17:34

I think he means store it in a temp variable before you use the string. There's a popular example that Joel gave on his blog that talks about this issue, where he's using a for loop and getting the length of a string in the condition. This makes the loop O(n^2), when it could be O(n). – Bill the Lizard Nov 24 '08 at 0:38
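The loop pattern described in that last comment looks roughly like this. It's only a sketch with made-up function names, and an optimizing compiler may sometimes hoist the strlen call on its own, so treat the O(n^2) behaviour as the worst case rather than a guarantee.

```cpp
#include <cstddef>
#include <cstring>

// Counts spaces, calling strlen in the loop condition. Unless the compiler
// hoists the call, every iteration rescans the whole string.
std::size_t count_spaces_slow(const char* s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < std::strlen(s); ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}

// Same result with the length computed once, before the loop.
std::size_t count_spaces_fast(const char* s) {
    std::size_t count = 0;
    const std::size_t len = std::strlen(s);
    for (std::size_t i = 0; i < len; ++i) {
        if (s[i] == ' ') ++count;
    }
    return count;
}
```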

There are a few disadvantages to C strings:

Getting the length is a relatively expensive operation.

No embedded NUL characters are allowed.

The signedness of chars is implementation defined.

The character set is implementation defined.

The size of the char type (in bits) is implementation defined.

You have to keep track separately of how each string was allocated and so how it must be freed, or even whether it needs to be freed at all.

There is no way to refer to a slice of the string as another string.

Strings are not immutable, meaning they must be synchronized separately.

Strings cannot be manipulated at compile time.

Switch cases cannot be strings.

The C preprocessor does not recognize strings in expressions.

You cannot pass strings as template arguments (C++).

The memory management etc. needed to grow a string (char array), if necessary, is kinda boring to reinvent.

This is not the fault of C-style strings. A std::string implementation may use C-style strings (in fact, most use a combination of C-style and Pascal-style strings), and it grows and shrinks automatically. – strager Nov 23 '08 at 17:22

Um, that was his point. C++ hides the "boring" aspects of the memory management required around C strings. – Evan Teran Nov 23 '08 at 19:42

"This is not the fault of C-style strings." How is this not the fault of C-style strings? – Max Lybbert Dec 4 '08 at 19:36

You may know that today 1024 bytes is enough to contain any input, but you don't know how things will change tomorrow or next year. If premature optimization is the root of all evil, magic numbers are the stem.

There is no way to embed NUL characters (if you need them for something) into C-style strings. – quinmars Nov 23 '08 at 15:02

I haven't tried it, but I think it's possible with std::string. string::c_str() will return a character pointer to a C string with an embedded null char, which any C-style code will interpret as the end of the string. – Ferruccio Nov 23 '08 at 15:31

Ah, yes, true. I was just a bit confused :). That you can have NULs in the middle of a string doesn't exclude that the std::string saves an extra terminating NUL internally. – quinmars Nov 23 '08 at 15:44
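For anyone who wants to try the embedded-NUL case, a minimal sketch follows. The helper names are invented for the example; the std::string constructor that takes a pointer and a count is standard.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds a 3-character std::string whose middle character is NUL.
// The length has to be passed explicitly, otherwise the constructor
// would stop at the first NUL just like any C-style code.
std::string make_with_nul() {
    return std::string("a\0b", 3);
}

// Length as seen by C code, which walks to the first NUL.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

The std::string reports a size of 3, while code that goes through c_str() sees only the first character.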

Well, to comment on your specific example, you don't know that the data returned by your call to df will fit into your buffer. Never trust unsanitized input into your application, even when it is supposedly from a known source like df. For example, if a program named 'df' is placed somewhere in your search path so that it is executed instead of the system df, it could be used to exploit your buffer limit. The same goes if df is replaced by a malicious program.

When reading input from a file, use a function that lets you specify the maximum number of bytes to read. Under OS X and Linux, fgets() is actually defined as char *fgets(char *s, int size, FILE *stream); so it would be safe to use on those systems.

Character encoding issues tend to surface when you have an array of bytes instead of a string of characters.

Unfortunately std::string does not help in this matter either, but there is of course wstring... – divideandconquer.se Nov 23 '08 at 14:40

wstring also doesn't care about encoding, unfortunately. – Johannes Schaub - litb Nov 23 '08 at 14:48

In your specific case, it's not the C string that's dangerous so much as reading an indeterminate amount of data into a fixed-size buffer. Don't ever use gets(char*), for example. Looking at your example, though, it doesn't seem at all correct. Try this:

    char buffer[1024];
    char *line = NULL;
    while ((line = fgets(buffer, sizeof(buffer), fp)) != NULL) {
        // parse one line of command output here.
    }

This is a perfectly safe use of C strings, although you'll have to deal with the possibility that line does not contain an entire line, but was rather truncated to 1023 characters (plus a null terminator).
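One possible way to notice the truncation mentioned above is to check whether the buffer came back full without a trailing newline. This is only a sketch: read_line and its truncated flag are hypothetical, and it assumes size is at least 2.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Reads one line into buf. Returns false at end of input or on error.
// Sets *truncated when the buffer filled up before a newline was seen,
// which means the rest of the line is still waiting in the stream.
bool read_line(std::FILE* fp, char* buf, int size, bool* truncated) {
    if (std::fgets(buf, size, fp) == nullptr) {
        return false;
    }
    const std::size_t len = std::strlen(buf);
    const bool has_newline = len > 0 && buf[len - 1] == '\n';
    *truncated = !has_newline && len == static_cast<std::size_t>(size) - 1;
    return true;
}
```

A caller that sees truncated set can keep calling read_line to collect the remainder of the line, or reject the input outright.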

Thanks. My example code wouldn't compile. I was more concerned with the char buffer issue and wrote the while loop (very lazily) from memory. – Bill the Lizard Nov 24 '08 at 1:01

I think it is OKAY to use them; people have been using them for years. But I would rather use std::string if possible, because:

1) You don't have to be so cautious every time, and you can think about the problems of your domain instead of remembering that you need to add another parameter every time, memory management, and that kind of stuff. It is just safer to code at a higher level.

2) There are probably some other small concerns which are not a big deal but still count, like the ones people already mentioned: encoding, Unicode, all the related stuff that the people who created std::string thought of. :)

Update: I worked on a project for half a year. Somehow I was stupid enough to never compile in release mode before delivery. :) Luckily there was just one error, which I found after 3 hours. It was a very simple string buffer overrun.

Absolutely agree. If all Bill's trying to do is parse the output of a *nix command, C++ strings are thousands of times better for this. In fact, Stroustrup's got an example of something similar in one of his FAQs. Perl would also shine for this kind of application. – Joe Pineda Nov 23 '08 at 18:42

No Unicode support is reason enough these days...

C strings have opportunities for misuse, due to the fact that one has to scan the string to determine where it ends.

strlen: to find the length, it scans the string until it hits the NUL or accesses protected memory.

strcat: it has to scan to find the NUL in order to determine where to begin concatenating.

There is no knowledge within a C string to tell whether there will be a buffer overrun or not.

C strings are risky, but generally faster than string objects.

strncat can be used to prevent overruns. – SoapBox Nov 23 '08 at 15:02

A "string object" may be implemented exactly as a C string. I'm sure the OP is looking at the concept of C-style strings and not their actual use in C. – strager Nov 23 '08 at 17:26

@strager: the concept of C-style strings is their actual use in C. – Max Lybbert Dec 4 '08 at 19:40
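On the strncat suggestion: the count passed to strncat is the maximum number of characters to append, not the size of the destination, so the caller still has to work out the remaining room. A sketch of one way to wrap that, with a hypothetical helper name and the assumption that dst already holds a NUL-terminated string inside its dst_size bytes:

```cpp
#include <cstddef>
#include <cstring>

// Appends src to the NUL-terminated string in dst without writing past
// dst_size bytes. Returns false if src had to be cut short.
bool append_bounded(char* dst, std::size_t dst_size, const char* src) {
    const std::size_t used = std::strlen(dst);
    if (used >= dst_size) {
        return false;                               // no room even for a terminator
    }
    const std::size_t room = dst_size - used - 1;   // bytes left, excluding the NUL
    std::strncat(dst, src, room);                   // strncat always writes a terminator
    return std::strlen(src) <= room;
}
```

Note that both strlen calls still scan their strings, which is the cost the answer above is describing.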

IMHO, the hardest point of cstrings is the memory management, because you need to be careful about whether you need to pass a copy of a cstring or whether you can pass a literal to a function, i.e., will the function free the passed string, or will it keep a reference for longer than the function call? The same applies to cstring return values. So without big effort it is not possible to share cstring copies. In many cases this ends with unnecessary copies of the same cstring in memory.

This question doesn't really have an answer. If you're writing in C, what other options do you have? If you're writing in C++, why are you asking?

What is the reason not to use C++ primitives? The only reason I can think of is linking C and C++ code and having char * somewhere in the interfaces. Sometimes it's just easier to use char * instead of converting back and forth all the time (especially if it's really 'good' C++ code that has 3 different C++ string object types).

If you write in C, you can always declare your own type as a struct, with all the operations (length_of, etc.) you need and your own conventions (for instance that the encoding is UTF-32). But C does not make it very convenient. – bortzmeyer Nov 23 '08 at 17:37

Actually you are right :) bstring.sourceforge.net. I was about to say nobody does this, but decided to search a little bit first. Wise decision it was :) – Ilya Nov 23 '08 at 18:47
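A bare-bones version of the struct idea might look like the following. It is written in a C-like subset of C++ to match the other examples here, every name in it is invented for illustration (only length_of echoes the comment above), and it is not how bstring does it.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical counted-string type; the names are illustrative only.
struct counted_str {
    char*       data;    // owned buffer, also NUL-terminated for convenience
    std::size_t length;  // number of characters, excluding the terminator
};

// Copies text into a newly allocated counted_str. data is null on failure.
counted_str counted_from(const char* text) {
    counted_str s{nullptr, 0};
    const std::size_t n = std::strlen(text);
    s.data = static_cast<char*>(std::malloc(n + 1));
    if (s.data != nullptr) {
        std::memcpy(s.data, text, n + 1);   // copies the terminator too
        s.length = n;
    }
    return s;
}

// Constant-time length: no scan for the terminator.
std::size_t length_of(const counted_str& s) {
    return s.length;
}

// Releases the buffer and resets the struct.
void counted_free(counted_str& s) {
    std::free(s.data);
    s.data = nullptr;
    s.length = 0;
}
```

Even this small version shows the inconvenience: every operation has to be written and called by hand, and nothing stops a caller from forgetting counted_free.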

C strings, like many other aspects of C, give you plenty of room to hang yourself. They are simple and fast, but unsafe in situations where assumptions such as the null terminator can be violated or input can overrun the buffer. To use them reliably you have to observe fairly hygienic coding practices.

There used to be a saying that the canonical definition of a high-level language was "anything with better string handling than C".

As the STL gets more mature, it seems like people will be increasingly comfortable with STL strings rather than C-style strings.
