Author | Message | Time |
---|---|---|
Skywing | Grok can't answer because I already explained about this to him. [code] #include <stdio.h> int main(int ac, char **av) { char ch = getc(stdin); printf("isspace(ch) returns %u.\n", isspace(ch)); return 0; }[/code] | December 1, 2003, 5:33 AM |
iago | will getc() actually pick up a whitespace? Or does it wait till the first character? If it doesn't pick up whitespaces, the printf() line is irrevelant because it'll never be a space. | December 1, 2003, 8:36 AM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=0#msg32860 date=1070267804] will getc() actually pick up a whitespace? Or does it wait till the first character? If it doesn't pick up whitespaces, the printf() line is irrevelant because it'll never be a space. [/quote] getc returns the next character in the stream, whitespace or not. | December 1, 2003, 12:53 PM |
Adron | You'll probably also have to specify what the program is supposed to do... It looks like it will compile just fine. [quote][black]There's the issue of a possible EOF return from getc, and that you have to input your character, then hit enter with the evil stdio functions.[/black][/quote] | December 1, 2003, 5:31 PM |
Eibro | I know the answer from reading chat between Skywing and Kp, but I won't reveal. :) | December 1, 2003, 7:32 PM |
Skywing | [quote author=Adron link=board=30;threadid=3985;start=0#msg32908 date=1070299887] You'll probably also have to specify what the program is supposed to do... It looks like it will compile just fine. [quote][black]There's the issue of a possible EOF return from getc, and that you have to input your character, then hit enter with the evil stdio functions.[/black][/quote] [/quote] The error in question isn't a compile-time problem. This program demonstrates a real problem I had to fix in a different program of mine. | December 1, 2003, 7:34 PM |
St0rm.iD | Probably unicode chars will kill it, since I'll bet that sizeof(char) == 1. | December 1, 2003, 9:12 PM |
Skywing | [quote author=St0rm.iD link=board=30;threadid=3985;start=0#msg32958 date=1070313168] Probably unicode chars will kill it, since I'll bet that sizeof(char) == 1. [/quote] No, those would be automatically converted into a best-fit ASCII character given the active code page. | December 1, 2003, 9:18 PM |
iago | It looks like it should do exactly what it's supposed to do.. read in a single character and detects whether it's a whitespace? | December 1, 2003, 10:04 PM |
Etheran | your methods are outdated, sky :p | December 1, 2003, 10:20 PM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=0#msg32979 date=1070316249] It looks like it should do exactly what it's supposed to do.. read in a single character and detects whether it's a whitespace? [/quote] However, there is a problem with it. Can you find the problem? Maybe we should have a C/C++ quiz question of the week :p | December 1, 2003, 10:34 PM |
iago | Is it because it doesn't return 0 for false and 1 for true, instead it returns 0 for false and non-zero for true? I decided to compile/run it, and for a space/tab/newline/etc it displayed 8192. Not that that's reallya problem, though, but iw oudl be better to do this: printf("isspace(ch) returns %s", isspace(ch) ? "true" : "false"); | December 1, 2003, 10:50 PM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=0#msg33017 date=1070319022] Is it because it doesn't return 0 for false and 1 for true, instead it returns 0 for false and non-zero for true? I decided to compile/run it, and for a space/tab/newline/etc it displayed 8192. Not that that's reallya problem, though, but iw oudl be better to do this: printf("isspace(ch) returns %s", isspace(ch) ? "true" : "false"); [/quote] isspace returns a zero or a nonzero value according to the documentation. This isn't the problem... | December 1, 2003, 11:36 PM |
iago | Then the code works fine. Skywing is crazy. | December 1, 2003, 11:39 PM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=0#msg33041 date=1070321957] Then the code works fine. Skywing is crazy. [/quote] Simply because it appears to work fine doesn't mean that nothing is wrong. | December 2, 2003, 12:05 AM |
iago | [quote author=Skywing link=board=30;threadid=3985;start=0#msg33050 date=1070323549] [quote author=iago link=board=30;threadid=3985;start=0#msg33041 date=1070321957] Then the code works fine. Skywing is crazy. [/quote] Simply because it appears to work fine doesn't mean that nothing is wrong. [/quote] It does, I'm afraid. So you lose. :) | December 2, 2003, 2:22 AM |
Grok | Skywing is correct. You're not thinking. | December 2, 2003, 2:24 AM |
DarkMinion | Maybe because you're outputting it as unsigned when it returns a signed integer? | December 2, 2003, 3:09 AM |
DarkMinion | Also, getc() returns a signed integer as well...could lead to certain problems | December 2, 2003, 3:10 AM |
Skywing | [quote author=DarkMinion link=board=30;threadid=3985;start=15#msg33099 date=1070334547] Maybe because you're outputting it as unsigned when it returns a signed integer? [/quote] That's not the problem in question. Outputting an int as unsigned isn't the cause of any breakage here. | December 2, 2003, 3:10 AM |
DarkMinion | Maybe it would help if you said what problem you were experiencing? Ballpark? | December 2, 2003, 3:13 AM |
iago | To quote the man page: [quote]GETS(3) Linux Programmer's Manual GETS(3) NAME fgetc, fgets, getc, getchar, gets, ungetc - input of char- acters and strings SYNOPSIS #include <stdio.h> [...] int getc(FILE *stream);[/quote] getc returns and int and your storing it in a char. Bad? | December 2, 2003, 3:17 AM |
Skywing | [quote author=DarkMinion link=board=30;threadid=3985;start=15#msg33103 date=1070334802] Maybe it would help if you said what problem you were experiencing? Ballpark? [/quote] The problem isn't related to the printf call failing or outputting the return value of isspace incorrectly because I used %u. For instance, the same problem would occur if you were using C++ and did std::cout << isspace(ch) << endl;. I'm not sure what else I can say without giving it away. | December 2, 2003, 3:18 AM |
DarkMinion | Bad iago copying what I just said. | December 2, 2003, 3:18 AM |
iago | [quote author=DarkMinion link=board=30;threadid=3985;start=15#msg33107 date=1070335134] Bad iago copying what I just said. [/quote] Not really.. I actually said what the problem IS. :P | December 2, 2003, 3:20 AM |
Skywing | Note that the topic is What can go wrong with this code? and not What does this code do wrong?. If you see something that you think is wrong, you should be able to tell about the consequences of that and soforth... | December 2, 2003, 3:20 AM |
iago | Well, if getc returns something greater than 255 (I don't know how this would happen, but it CAN since it's an <int>), it will overflow the char. I don't think anything with break, besides getting an invalid value in ch. | December 2, 2003, 3:22 AM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=15#msg33110 date=1070335347] Well, if getc returns something greater than 255 (I don't know how this would happen, but it CAN since it's an <int>), it will overflow the char. I don't think anything with break, besides getting an invalid value in ch. [/quote] I think that in the case of an overflow, ch will get the return value of getc modulo max_value_of_char, possibly adjusting for signedness. In any case, I don't think it'll have an "invalid" value (assume all values of a char are "valid" characters). | December 2, 2003, 3:25 AM |
DarkMinion | Actually iago, you really did just say what I just said :P But according to Sky, it's wrong | December 2, 2003, 3:32 AM |
iago | True, but if you press character 0x120 (look carefully, it's in the corner), and it modulos it by sizeof(char), it will incorrectly identify it as a space. Another thing I was thinking, \n is made up of 2 characters on some computers; will that affect the output, or is 0x0d 0x0a read in as a single character? [edit] fixed \n hex.. oops :) | December 2, 2003, 3:32 AM |
DarkMinion | I think getc() reads it as 0x0a | December 2, 2003, 3:38 AM |
DarkMinion | Ok, here is A problem that I have found... I ran this code (modified a little to show the value of ch): [code] #include <stdio.h> #include <ctype.h> int main(int ac, char **av) { char ch = getc(stdin); printf("isspace(ch(%u)) returns %u.\n", ch, isspace(ch)); return 0; } [/code] If you input the character alt+160, you get these results: [img]http://dmbot.virtualave.net/problem.png[/img] What happens is 0xffffff gets stored in ch, and that screws up the compilation of isspace() Declaring ch as int fixes this problem... | December 2, 2003, 5:54 PM |
iago | Another potential problem is that the user can enter A LOT of crap before pressing enter. This might overflow some internal buffer and break the code, depending on the OS and the implementation of the stdin buffer. | December 2, 2003, 6:03 PM |
Skywing | Iago is totally off. This isn't a problem with input buffering at all. The character in question could have come from anywhere. | December 2, 2003, 6:33 PM |
iago | Which character in question? | December 2, 2003, 6:35 PM |
DarkMinion | So...am I right or wrong? | December 2, 2003, 7:51 PM |
Skywing | [quote author=DarkMinion link=board=30;threadid=3985;start=30#msg33230 date=1070394686] So...am I right or wrong? [/quote] You found something, but I'm not sure if you recognized what it would cause. | December 2, 2003, 7:52 PM |
iago | Well, the problem that DM's thing shows is caused by sign extention, it seems. So any character >=0x80 would be sign extended. But I don't see that causing a problem? | December 2, 2003, 8:08 PM |
Adron | I stand by my original answer. If the purpose of the code is to work the way the code is written, then there can be no error as long as it compiles. If you want us to find an error, you'll have to define what the code is supposed to do. Well, and the pits I pointed out. | December 2, 2003, 10:21 PM |
Skywing | [quote author=Adron link=board=30;threadid=3985;start=30#msg33265 date=1070403671] I stand by my original answer. If the purpose of the code is to work the way the code is written, then there can be no error as long as it compiles. If you want us to find an error, you'll have to define what the code is supposed to do. Well, and the pits I pointed out. [/quote] The code is supposed to determine if a character is whitespace or not. | December 2, 2003, 10:53 PM |
DarkMinion | Yeah, and alt+0160 is whitespace, but using your code, isspace says it's not! Problem!!! I win!!! :P | December 2, 2003, 10:59 PM |
iago | It's because the parameter to isspace is an int, and ch is a char, so ch is being sign-extended :) | December 2, 2003, 11:04 PM |
Adron | There's no risk of spaces, tabs, formfeeds, linefeeds or carriage returns ever getting sign extended though? The manual says: [quote] Return Value isspace returns a non-zero value if c is a white-space character (0x09 – 0x0D or 0x20). iswspace returns a non-zero value if c is a wide character that corresponds to a standard white-space character or is one of an implementation-defined set of wide characters for which iswalnum is false. Each of these routines returns 0 if c does not satisfy the test condition. [/quote] | December 2, 2003, 11:23 PM |
Moonshine | There may be potential problems if one doesn't do bounds checking on 'ch', if you link with the debug CRT library. [quote] When used with a debug CRT library, isspace will display a CRT assert if passed a parameter that isn't EOF or in the range of 0 through 0xFF. When used with a debug CRT library, isspace will use the parameter as an index into an array, with undefined results if the parameter isn't EOF or in the range of 0 through 0xFF.[/quote] | December 3, 2003, 2:51 AM |
Skywing | The problem is that if char is signed, and the character ch is a negative number, it will be sign extended to a negative integer (which is probably not going to be equal to EOF) when passed to isspace. This is out of range input for isspace and causes undefined behavior. One fix is to do the following: isspace((unsigned char)ch) The problem is perhaps more insidious if a char array had been read in via gets or fgets and you had a loop that went through each character checking to see if that character is whitespace or not. In that case, you wouldn't even have the possible warning that something could be up due to getc() returning an int and possibly returning EOF. | December 3, 2003, 3:44 AM |
iago | I dont' see how that would create a problem, though, since it still correctly identifies everything that the documentation says it will... | December 3, 2003, 7:42 AM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=45#msg33379 date=1070437321] I dont' see how that would create a problem, though, since it still correctly identifies everything that the documentation says it will... [/quote] The operation of the function is undefined when given a negative integer that is not EOF. This means that it could possibly identify your input as whitespace even if it is not whitespace, or even crash. Consider a simple implementation somewhat like this...: [code] int isspace(int ch) { char table[255] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1, ...}; if(ch == EOF) ... else return table[ch]; } [/code] If you pass a negative value that isn't EOF, it will be reading from memory that isn't in the table. Of course, exactly what will happen depends on how your particular isspace implementation works, but it probably won't be what you want. | December 3, 2003, 1:05 PM |
iago | Wouldn't it correctly idenfify the -1 as EOF? Assuming you're using that as a terminator.. | December 3, 2003, 1:19 PM |
Skywing | [quote author=iago link=board=30;threadid=3985;start=45#msg33389 date=1070457596] Wouldn't it correctly idenfify the -1 as EOF? Assuming you're using that as a terminator.. [/quote] "If you pass a negative value that isn't EOF, it will be reading from memory that isn't in the table." | December 3, 2003, 1:32 PM |
iago | hm, I didnt' read your post right. I see what you mean now! I didn't notice you declared table[b][255][b]. You're right, then, that if you try to reference a character >127, it'll reference a negative value. That was my bad :) | December 3, 2003, 1:38 PM |
Adron | The definition for the isspace I checked didn't mention not supporting negative numbers. It just said it will return true for those particular numbers and false for all others. | December 3, 2003, 4:16 PM |
iago | I agree with Adron here. If you had given your more recent code instead of using isspace, it would have been more clear. As it was, everything acted completely predictably in the manner that the functions involved say they will. | December 3, 2003, 6:38 PM |