Bug ID: 5088563 Matcher.find throws StringIndexOutOfBoundsException if pattern is missing ']'

Details
Type: Bug
Submit Date: 2004-08-18
Status: Resolved
Updated Date: 2007-08-17
Project Name: JDK
Resolved Date: 2007-06-12
Component: core-libs
OS: generic
Sub-Component: java.util.regex
CPU: generic
Priority: P4
Resolution: Fixed
Affected Versions: 5.0, 5.0u12
Fixed Versions: 5.0u14

Related Reports

Sub Tasks

Description
Tried on Solaris-9 JDK 1.5.0-beta3-b58

Pattern : "\p{javaMirrored}\P{javaMirrored}+\p{javaMirrored}" 
Input : sdjhjshdka{dhhd}sjdhjs 
Works fine for the above input.

But for the input "sdjhjshdka{dhhd}sjdhjssdkjd[sdsd", Matcher.find throws the following exception:
java.lang.StringIndexOutOfBoundsException: String index out of range: 32
	at java.lang.String.charAt(String.java:558)
	at java.util.regex.Pattern.countChars(Pattern.java:2791)
	at java.util.regex.Pattern.access$000(Pattern.java:595)
	at java.util.regex.Pattern$Not.match(Pattern.java:3764)
	at java.util.regex.Pattern$Curly.match0(Pattern.java:4222)
	at java.util.regex.Pattern$Curly.match(Pattern.java:4196)
	at java.util.regex.Pattern$JavaTypeClass.match(Pattern.java:3595)
	at java.util.regex.Pattern$Start.match(Pattern.java:3019)
	at java.util.regex.Matcher.search(Matcher.java:1092)
	at java.util.regex.Matcher.find(Matcher.java:528)
	at Test1.check(Test1.java:9)
	at Test1.main(Test1.java:20)
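
Note that the reported index, 32, is exactly the length of the failing input, i.e. one past the last valid index. The small check below illustrates that arithmetic. It is only a sketch; the class name IndexCheck and the helper charAtThrows are made up for illustration and are not part of the original report.

```java
// Illustrative only: shows that index 32 is one past the end of the failing input.
final class IndexCheck {

    // Hypothetical helper: reports whether String.charAt rejects the given index.
    static boolean charAtThrows(String s, int index) {
        try {
            s.charAt(index);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String input = "sdjhjshdka{dhhd}sjdhjssdkjd[sdsd";
        System.out.println("length = " + input.length());
        // Last valid index is length - 1.
        System.out.println("charAt(length - 1) throws: " + charAtThrows(input, input.length() - 1));
        // Index == length is what the stack trace above reports.
        System.out.println("charAt(length) throws: " + charAtThrows(input, input.length()));
    }
}
```

This is consistent with the matcher reading one character beyond the end of the input rather than with a malformed pattern.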

Test Case :
execute java Test1 "\p{javaMirrored}\P{javaMirrored}+\p{javaMirrored}" "sdjhjshdka{dhhd}sjdhjssdkjd[sdsd"

import java.util.regex.*;

public class Test1 {

    public void check(String str1, String str2) {
        try {
            Pattern p = Pattern.compile(str1);
            Matcher m = p.matcher(str2);
            while (m.find()) {
                System.out.println(m.group());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String args[]) {

        Test1 ref = new Test1();
        ref.check(args[0], args[1]);
    }

}

                                    

Comments
SUGGESTED FIX

Basically the change is
*** src/share/classes/java/util/regex/Pattern.java-     Wed Feb 14 13:55:32 2007
--- src/share/classes/java/util/regex/Pattern.java      Mon Apr 23 12:16:36 2007
*** 3764,3775 ****
--- 3764,3778 ----
          Node atom;
          Not(Node atom) {
              this.atom = atom;
          }
          boolean match(Matcher matcher, int i, CharSequence seq) {
+             if (i < matcher.to)
                  return !atom.match(matcher, i, seq)
                    && next.match(matcher, i+countChars(seq, i, 1), seq);
+             matcher.hitEnd = true;
+             return false;
          }
          boolean study(TreeInfo info) {
              info.minLength++;
              info.maxLength++;
              return next.study(info);
                                     
2007-04-23
EVALUATION

The error also occurs with other patterns.

Test case:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class Bug5088563 {
    private static String REGEX;
    private static String INPUT;
    private static Pattern pattern;
    private static Matcher matcher;
    private static boolean found;

    public static void main(String[] argv) {
        initResources();
        processTest();
    }

    private static void initResources() {
       try {
           REGEX = "\\p{javaWhitespace}\\P{javaWhitespace}+\\p{javaWhitespace}";
           INPUT = "ASDASSAs    dsdssad fssdsASAd   sdsd sdsd  ddd";
       } catch (Exception ioe) {
             ioe.printStackTrace();
       }

        pattern = Pattern.compile(REGEX);
        matcher = pattern.matcher(INPUT);
        System.out.println("Current REGEX is: "+REGEX);
        System.out.println("Current INPUT is: "+INPUT);
    }

    private static void processTest() {
         try{
                while(matcher.find()) {
                    System.out.println("I found the text \"" + matcher.group() +
                                   "\" starting at index " + matcher.start() +
                                   " and ending at index " + matcher.end() + ".");
                    found = true;
                }
                if(!found)
                        System.out.println("No match found.");
                
             System.out.println("Test case passed");
             System.exit(0);
         }catch(Exception exe){
             exe.printStackTrace();
             System.out.println("Test case failed");      
             System.exit(1);
         }
    }
}

Output is:
Current REGEX is: \p{javaWhitespace}\P{javaWhitespace}+\p{javaWhitespace}
Current INPUT is: ASDASSAs    dsdssad fssdsASAd   sdsd sdsd  ddd
I found the text " dsdssad " starting at index 11 and ending at index 20.
I found the text " sdsd " starting at index 31 and ending at index 37.
java.lang.StringIndexOutOfBoundsException: String index out of range: 46
        at java.lang.String.charAt(String.java:558)
        at java.util.regex.Pattern.countChars(Pattern.java:2791)
        at java.util.regex.Pattern.access$000(Pattern.java:595)
        at java.util.regex.Pattern$Not.match(Pattern.java:3769)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4228)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4202)
        at java.util.regex.Pattern$JavaTypeClass.match(Pattern.java:3600)
        at java.util.regex.Pattern$Start.match(Pattern.java:3019)
        at java.util.regex.Matcher.search(Matcher.java:1092)
        at java.util.regex.Matcher.find(Matcher.java:528)
        at Bug5088563.processTest(Bug5088563.java:59)
        at Bug5088563.main(Bug5088563.java:38)
Test case failed

Since the initial bug was filed against Tiger and was closed as not reproducible without actually being fixed there, it needs to be reopened.
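
For checking a given JDK build, both reported cases can be combined into one small regression check. This is a sketch, not the test that was committed with the fix; the class name Bug5088563Regression and the helper findAll are assumptions made for illustration. It uses only Pattern.compile, Matcher.find and Matcher.group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative regression check for the two inputs in this report.
final class Bug5088563Regression {

    // Hypothetical helper: collects every match; on an affected build find() throws instead.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Case from the original description.
        System.out.println(findAll(
                "\\p{javaMirrored}\\P{javaMirrored}+\\p{javaMirrored}",
                "sdjhjshdka{dhhd}sjdhjssdkjd[sdsd"));
        // Case from the evaluation above.
        System.out.println(findAll(
                "\\p{javaWhitespace}\\P{javaWhitespace}+\\p{javaWhitespace}",
                "ASDASSAs    dsdssad fssdsASAd   sdsd sdsd  ddd"));
    }
}
```

On a build that contains the fix, find() should simply return false after the last match: the first case yields the single match "{dhhd}", and the second yields the two matches already shown in the output above, with no exception.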
                                     
2007-02-16
SUGGESTED FIX

        boolean match(Matcher matcher, int i, CharSequence seq) {
            if (i < matcher.to)
                return !atom.match(matcher, i, seq)
                    && next.match(matcher, i+countChars(seq, i, 1), seq);
            matcher.hitEnd = true;
            return false;
        }
                                     
2006-05-20
EVALUATION

This problem has been "accidentally" fixed in Mustang by other regex rewrite work.
The root cause is the incorrect implementation of the Not class in the 5.0 codebase; see the
suggested fix for a possible solution if a fix in a 5.0 update release is desired.
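
The bounds check in the suggested fix can be illustrated outside the JDK with a simplified model. The sketch below is not the JDK's Pattern.Not class: the class NotNodeSketch, its fields, and the convention of returning the next index (or -1) are assumptions made for illustration. Only the idea is taken from the suggested fix: test the index against the region end before reading a character, and record hitEnd otherwise.

```java
import java.util.function.IntPredicate;

// Simplified, hypothetical model of a negated single-character node.
final class NotNodeSketch {
    private final IntPredicate atom; // the character class being negated
    private final int to;            // region end (exclusive), like matcher.to
    boolean hitEnd;                  // set when the end of input is reached, like matcher.hitEnd

    NotNodeSketch(IntPredicate atom, int to) {
        this.atom = atom;
        this.to = to;
    }

    // Returns the index after the consumed character, or -1 if there is no match at i.
    int match(CharSequence seq, int i) {
        if (i < to) {
            int cp = Character.codePointAt(seq, i);
            // Advance by one code point, which may be one or two chars.
            return atom.test(cp) ? -1 : i + Character.charCount(cp);
        }
        // Without this guard, reading at i == to would run past the input.
        hitEnd = true;
        return -1;
    }

    public static void main(String[] args) {
        String input = "[sdsd";
        NotNodeSketch not = new NotNodeSketch(Character::isMirrored, input.length());
        int i = 1; // start just after the '['
        int next;
        while ((next = not.match(input, i)) != -1) {
            i = next;
        }
        System.out.println("stopped at index " + i + ", hitEnd = " + not.hitEnd);
    }
}
```

In this model the loop consumes "sdsd" and then stops cleanly at the end of the input with hitEnd set, instead of attempting to read the character at index 5.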
                                     
2006-05-20
EVALUATION

Likely a documentation issue.

-- iag@sfbay 2004-09-18
                                     
2004-09-18


