US8094872B1 - Three-dimensional wavelet based video fingerprinting - Google Patents

Three-dimensional wavelet based video fingerprinting

Info

Publication number
US8094872B1
US8094872B1 (application US11/746,339)
Authority
US
United States
Prior art keywords
video
frames
fingerprint
segment
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US11/746,339
Inventor
Jay Yagnik
Henry A. Rowley
Sergey Ioffe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to US11/746,339 (US8094872B1)
Assigned to GOOGLE INC. Assignors: YAGNIK, JAY; IOFFE, SERGEY; ROWLEY, HENRY A.
Priority to US12/968,825 (US8611689B1)
Priority to US13/250,494 (US8340449B1)
Application granted
Publication of US8094872B1
Assigned to GOOGLE LLC (change of name from GOOGLE INC.)
Legal status: Active
Expiration adjusted

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content

Definitions

  • the invention generally relates to video processing, and more specifically to video fingerprinting.
  • Electronic video libraries may contain thousands or millions of video files, making management of these libraries an extremely challenging task.
  • the challenges become particularly significant in the case of online video sharing sites where many users can freely upload video content.
  • users upload unauthorized copies of copyrighted video content, and as such, video hosting sites need a mechanism for identifying and removing these unauthorized copies.
  • While some files may be identified by file name or other information provided by the user, this identification information may be incorrect or insufficient to correctly identify the video.
  • An alternate approach of using humans to manually identify video content is expensive and time consuming.
  • users of video sharing sites may upload multiple copies of the same video content to the site. For example, popular items such as music videos may be uploaded many times by multiple users. This wastes storage space and becomes a significant expense to the host.
  • a third problem is that due to the large number of files, it is very difficult to organize the video library based on video content. Thus, search results may have multiple copies of the same or very similar videos making the results difficult to navigate for a user.
  • Various methods have been used to automatically detect similarities between video files based on their video content.
  • various identification techniques, such as an MD5 hash on the video file, have been used for this purpose.
  • a digital “fingerprint” is generated by applying a hash-based fingerprint function to a bit sequence of the video file; this generates a fixed-length monolithic bit pattern—the fingerprint—that uniquely identifies the file based on the input bit sequence.
  • fingerprints for files are compared in order to detect exact bit-for-bit matches between files.
  • a fingerprint can be computed for only the first frame of video, or for a subset of video frames.
  • each of these methods often fails to identify videos uploaded by different users with small variations that change the exact bit sequences of the video files.
  • videos may be uploaded from different sources and may vary slightly in how they are compressed and decompressed.
  • different videos may have different source resolutions, start and stop times, frame rates, and so on, any of which will change the exact bit sequence of the file, and thereby prevent them from being identified as a copy of an existing file.
  • an improved technique is needed for finding similarities between videos and detecting duplicate content based on the perceived visual content of the video.
  • a technique is needed for comparing videos that is unaffected by small differences in compression factors, source resolutions, start and stop times, frame rates, and so on.
  • the technique should be able to compare and match videos automatically without relying on manual classification by humans.
  • a method and system generates and compares fingerprints for videos in a video library using fingerprints that represent spatial information within certain frames of the video, as well as sequential information between frames.
  • the methods for generating video fingerprints provide a compact representation of the spatial and sequential characteristics that can be used to quickly and efficiently identify video content.
  • the methods also allow for comparing videos by using their fingerprints in order to find a particular video with matching content (such as, for example, to find and remove copyright protected videos or to find and remove duplicates).
  • the methods enable organizing and/or indexing a video library based on their visual content by using video fingerprints. This can provide improved display of search results by grouping videos with matching content.
  • a video fingerprint is generated by applying a three-dimensional transform to a video segment.
  • the video fingerprint represents both the spatial characteristics within the frames of the video segment and sequential characteristics between frames; the transform is said to be three-dimensional because the spatial information within frames provides two dimensions of information, while the sequential information provides the third dimension of temporal information.
  • because the fingerprint is based on the spatial and sequential characteristics of the video segment rather than an exact bit sequence, video content can be effectively compared even when videos have variations in compression factors, source resolutions, start and stop times, frame rates, and so on.
  • a set of fingerprints associated with the segments of a video provides a fingerprint sequence for the video.
  • the set of video fingerprints for a received video can be compared against reference fingerprints for videos stored in a reference database. In this manner, matching videos can be efficiently located. This is useful for at least two reasons. First, when a video is uploaded to a file sharing site, it may be immediately checked against all videos in the library. If matches are found, the video can be properly indexed in order to eliminate presentation of duplicates in search results. Alternatively, it may be desirable to discard the uploaded video if any matches are found and only accept new entries to the library that are unique. Second, if a video is known to be copyright protected, its fingerprint can be used to efficiently search for visually identical videos in the library so that copyrighted material can be removed.
  • a system for detecting duplicate video content includes an ingest server, a fingerprinting module, an indexing module, a matching module, and a reference database.
  • the ingest server receives an input video from a video source and provides the video to the fingerprinting module, which generates a fingerprint sequence for the ingest video.
  • Each fingerprint in the fingerprint sequence is indexed by the indexing module according to one or more hash processes which selectively reduce the dimensionality of the fingerprint data.
  • a matching module compares fingerprints and/or fingerprint sequences in the reference database to the fingerprint sequence associated with the ingest video and determines if a match is found.
  • the matching module may be used both to locate particular video content from a query and to organize video search results based on their content.
  • a system for generating a video fingerprint sequence includes a normalization module, a segmenting module, a transform module, and a quantization module.
  • the normalization module converts received videos to a standard format for fingerprinting.
  • the segmenting module segments the normalized video into a number of segments, each segment including a number of frames.
  • Each segment of frames is separately transformed by the transform module in the horizontal, vertical, and time dimensions.
  • This three-dimensional transform computes frequency information about edge differences in the spatial and temporal dimensions. The result is a three-dimensional array of coefficients that will be unique to the spatial and sequential characteristics of the group of frames.
  • a Haar wavelet transform provides one example of a transform that can be used for this purpose; various other transforms may also be utilized.
  • a quantizing module quantizes the three-dimensionally transformed segment in order to reduce the amount of data while still preserving the spatial and sequential characteristics of the video.
  • the quantized transform results provide a video fingerprint for each video segment.
  • a fingerprint sequence for the video is formed from the ordered set of fingerprints of the video segments.
  • FIG. 1 is a high level block diagram illustrating a system for comparing video content in a video library.
  • FIG. 2 is a block diagram illustrating an architecture for generating a video fingerprint.
  • FIG. 3 is a diagram illustrating a video structure as a series of frames.
  • FIG. 4 is a flowchart illustrating a process for generating a video fingerprint.
  • FIG. 5 is a diagram illustrating a technique for segmenting a video into overlapping segments.
  • FIG. 6 is a flowchart illustrating a process for computing a transform used in generating a video fingerprint.
  • FIG. 7 is a diagram illustrating computation of a transform used in generating a video fingerprint.
  • FIG. 8 is a flowchart illustrating a process for indexing video fingerprints.
  • FIG. 9 illustrates an example of indexed video segments.
  • FIG. 10 is a flowchart illustrating a process for matching video fingerprints.
  • FIG. 1 is a high-level block diagram illustrating a system for comparing video content.
  • the system comprises an ingest server 104 , a fingerprinting module 106 , an indexing module 108 , a matching module 110 , a reference database 112 , and a video library 116 .
  • different or additional modules may be used.
  • the ingest server 104 is adapted to receive one or more videos from a video source 102 .
  • the video source 102 can be, for example, a client computer coupled to the ingest server 104 through a network. In this configuration, a user can upload video content to the ingest server 104 from a remote location.
  • the video source 102 can be a database or other storage device coupled to the ingest server 104 .
  • the video source 102 can be a video storage medium such as a DVD, CD-ROM, Digital Video Recorder (DVR), hard drive, Flash memory, or other memory.
  • the ingest server 104 may also be coupled directly to a video capture system such as a video camera.
  • the ingest server 104 stores the received videos to the video library 116 .
  • the ingest server 104 can also pass received videos directly to the fingerprinting module 106 for fingerprinting immediately upon receipt.
  • the ingest server 104 pre-processes the received video to convert it to a standard format for storage in the video library 116 .
  • the ingest server 104 can convert the frame rate, frame size, and color depth of a received video to predetermined formats.
  • the storage format can be Adobe FLASH®, with a frame size of 320×240 at 15 fps, and 8-bit color.
  • the fingerprinting module 106 receives a video from the ingest server 104 or from the video library 116 and generates a sequence of fingerprints associated with the video. Typically, the fingerprint module 106 divides the received video into multiple overlapping segments with each segment comprising a number of video frames, and a fingerprint is separately generated for each segment. Each fingerprint compactly represents spatial information within the group of video frames in the video segment and sequential characteristics between frames of the video segment. The fingerprint uniquely identifies a video segment based on its visual content such that minor variations due to compression, de-compression, noise, frame rate, start and stop time, source resolutions and so on do not significantly affect the fingerprint generated for the video segment. The complete ordered set of video fingerprints for the segments of a video provides a fingerprint sequence for the video.
  • the indexing module 108 receives the video fingerprint sequences for each video from fingerprinting module 106 and indexes the fingerprint sequences into the reference database 112 .
  • the indexing process can use a variety of different hash techniques to generate a signature for a fingerprint that uniquely identifies the fingerprint while fixing the size of the fingerprint data.
  • the signature is broken into signature blocks and indexed in hash tables. Indexing beneficially reduces the number of bit comparisons needed to compare two fingerprints. Thus, searches for matching fingerprints can be accelerated relative to direct bit-for-bit comparisons of fingerprints.
  • the matching module 110 compares videos or video segments and generates a matching score indicating the likelihood of a match.
  • the matching module 110 compares the fingerprint sequence of an ingest video to reference fingerprint sequences stored in the reference database 112 .
  • the matching module 110 compares fingerprint sequences in the reference database 112 corresponding to two or more videos stored in video library 116 .
  • the matching module 110 may further receive a search query from a user requesting particular content and output a video 118 from the video library 116 that matches the query 114 .
  • the video library 116 is a storage device for storing a library of videos.
  • the video library 116 may be any device capable of storing data, such as, for example, a file server, a hard drive, a writeable compact disk (CD) or DVD, or a solid-state memory device.
  • Videos in the video library 116 are generally received from the ingest server 104 and can be outputted to the fingerprinting module 106 for fingerprinting. Videos that are relevant to a search query 114 are also outputted 118 by the matching module 110 .
  • the reference database 112 stores the indexed fingerprints for each video in the video library 116 . Each entry in the reference database 112 corresponds to signature blocks generated in the indexing process. Each entry is mapped to unique identifiers of the video segments corresponding to each signature block.
  • the reference database 112 can be searched by the matching module 110 to quickly compare fingerprints and/or fingerprint sequences.
  • a first scenario enables the system to query-by-video to find identical or similar videos to a selected video.
  • a system operator provides an input query 114 to the matching module 110 .
  • the input query 114 is in the form of a video having particular content of interest such as, for example, video content that is copyright protected.
  • a fingerprint sequence is generated for the copyright protected video and the reference database 112 is searched for matching fingerprints. Unauthorized copies can then be removed from the video library 116 (or otherwise processed) if the matching module 110 detects a match.
  • new uploads can be automatically screened for unauthorized copies of known copyrighted works.
  • a newly uploaded video is fingerprinted and the fingerprint sequence is compared against fingerprint sequences for the known copyrighted videos. Then, matching uploads are blocked from storage in the video library 116 .
  • the video can be processed in pieces as it is received so that the full video need not be received before processing begins.
  • the system is used to detect and remove multiple copies of video content from the video library 116 .
  • Duplicate or near duplicate videos may be found within the video library 116 , or new videos uploaded by the ingest server 104 may be automatically compared against videos in the video library 116 .
  • Duplicate videos found in the video library 116 are removed in order to save storage space.
  • the new video is simply discarded.
  • the system can be used to provide organized search results of videos.
  • a user provides an input query 114 and the matching module 110 returns relevant video results.
  • the input query 114 can be in the form of a conventional text-based search query or can be in the form of a video file as described previously.
  • video results are compared to one another by the matching module 110 and matching videos are grouped together in the search results.
  • the fingerprinting module 106 is adapted to receive an input video that has been pre-processed by the ingest server 104 , and generate one or more fingerprints representing spatial and sequential characteristics associated with the video.
  • the fingerprinting module 106 comprises a normalization module 210 , a segmenting module 220 , a transform module 230 , and a quantization module 240 .
  • the fingerprinting module 106 can have additional or different modules than those illustrated.
  • An example structure for a video received by the fingerprinting module 106 is provided in FIG. 3 .
  • the video comprises a series of frames 300 .
  • Each frame 300 comprises an image having a plurality of pixels arranged in a two-dimensional grid (for example, in an X direction and a Y direction).
  • the frames 300 are also arranged sequentially in time (the t direction).
  • a video comprises both spatial information, defined by the arrangement of pixels in the X and Y directions, and sequential or temporal information defined by how the pixels change throughout the time (t) dimension.
  • the normalization module 210 generally standardizes the data to be processed during fingerprinting.
  • the normalization module 210 includes a frame rate converter 212 , a frame size converter 214 and color converter 216 to normalize video to a predetermined format for fingerprinting. Converting video to a standardized fingerprint format ensures that videos are consistent and can produce comparable results. Often, frame rate, frame size, and color information are reduced by the normalization module 210 in order to improve the speed and power efficiency of the fingerprinting process. For example, the normalization module 210 can convert the video to luminance (grayscale) values without color, reduce the frame rate to 15 fps, and reduce the frame size to 64×64.
  • the number of pixels in each row and column of the frame size is preferably a power of 2 (e.g., 64×64) but any frame size is possible.
  • Each of the standard formats used by the normalization module 210 may be predetermined or may be determined dynamically based on various constraints such as, for example, available power, available bandwidth, or characteristics of the received video.
  • the segmenting module 220 receives the normalized video from the normalization module 210 and divides the video into a number of segments with each segment including a number of frames.
  • the segments may be stored, for example, in temporary buffers and outputted separately to the transform module 230 .
  • the segments preferably overlap by some number of frames.
  • the transform module 230 operates on the video segments obtained from the segmenting module 220 .
  • the transform module 230 comprises a row transformer 232 , a column transformer 234 , and a time column transformer 236 for performing a three-dimensional transform on each video segment.
  • This three-dimensional transform computes frequency information about edge differences in two spatial dimensions and one temporal dimension. Because the transform results are based on the spatial and sequential characteristics rather than an exact bit sequence, the fingerprint can identify a video segment based on its content even in the presence of variations in compression factors, source resolutions, start and stop times, frame rates, and so on.
  • the output of the transform module 230 is a three-dimensional array of coefficients that will be unique to the spatial and sequential characteristics of the group of frames in each video segment.
  • a quantization module 240 quantizes the three-dimensionally transformed segment in order to standardize the data size while still preserving the spatial and sequential characteristics of the video. Additionally, the quantization module 240 encodes and flattens the transformed coefficient array to a one-dimensional bit vector. The one-dimensional bit vector provides a fingerprint for an individual video segment.
  • referring to FIG. 4 , a flowchart illustrates a process for generating a fingerprint sequence using the fingerprinting module 106 .
  • An input video is received by the fingerprinting module 106 and normalized 402 by the normalization module 210 .
  • the frame size converter 214 scales down the size of the received video frames.
  • the converted frames have a fixed number of pixels across the row and column of each frame.
  • the color converter 216 generally reduces the color information in the video for example by converting to a grayscale such that each pixel is represented by a single luminance value.
  • the segmenting module 220 separates 404 the normalized video into a number of segments of consecutive video frames that may be overlapping.
  • each video segment includes the same number of video frames, typically a number equal to 2^n, where n is an integer.
  • the segments of video frames preferably overlap by a fixed number of frames.
  • in FIG. 5 , an axis is illustrated representing the numbered sequence of frames in a video file, with three segments S 1 , S 2 , and S 3 , each having 64 frames, and having 16 frames between the start of each segment.
  • a first segment S 1 is illustrated comprising frames 0 - 63
  • a second segment S 2 comprises frames 16 - 79
  • a third segment S 3 comprises frames 32 - 95 . Additional segments may be similarly obtained from the video.
  • a video segment may comprise a different number of frames and segments may overlap by any number of frames.
  • a video may be segmented into segments of varying length or varying overlap.
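  • the segmentation scheme above can be sketched in Python; the function name and defaults are illustrative (64-frame segments starting every 16 frames, as in FIG. 5), not part of the patent:

```python
def segment_video(num_frames, segment_length=64, stride=16):
    """Return (first, last) frame indices for overlapping segments.

    Defaults follow FIG. 5: 64-frame segments whose start points are
    16 frames apart, so consecutive segments overlap by 48 frames.
    """
    segments = []
    start = 0
    while start + segment_length <= num_frames:
        segments.append((start, start + segment_length - 1))
        start += stride
    return segments

# For a 96-frame video this reproduces S1, S2, S3 from FIG. 5:
# segment_video(96) → [(0, 63), (16, 79), (32, 95)]
```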
  • the transform module 230 transforms 406 the video segment by applying a three-dimensional transform to the group of frames in the video segment.
  • a transform is applied to each row, column, and time column for a video segment by the row transformer 232 , column transformer 234 , and time column transformer 236 respectively.
  • a row refers to a set of pixels aligned in the horizontal (X) direction of a video frame and a column refers to a set of pixels aligned in a vertical direction (Y) of a video frame.
  • a time column refers to a set of pixels having the same horizontal and vertical location within a frame, but belonging to different frames (Z direction).
  • the row transformer, column transformer, and time column transformer apply identical mathematical functions but operate on different dimensions of the received video segment.
  • the row transformer 232 , column transformer 234 , and time column transformer 236 each apply a Haar wavelet transform across their respective dimensions.
  • different types of transforms may be used such as, for example, a Gabor transform, or other related transform.
  • FIG. 6 illustrates an example process for transforming a row of the video segment by applying a Haar wavelet transform. It is noted that different processes other than the process illustrated can be used to compute a Haar wavelet transform. The process in FIG. 6 can be viewed in conjunction with FIG. 7 which graphically illustrates the intermediate results of the transform at various stages of the process.
  • the Haar wavelet transform is conceptually illustrated in FIG. 7 for a row of 8 pixels. It will be apparent to one of ordinary skill that the described technique can be extended to a row of any size.
  • the transform is not limited to a row, but can be similarly applied to any column or time column.
  • a row of pixels is received 602 by the row transformer 232 .
  • the row 702 comprises 8 pixels having values A-H.
  • the values A-H may represent, for example, the luminance value of the pixels or some other characteristic of the pixels such as color information.
  • a new row 704 is created 604 with a length equal to the length of the original row 702 .
  • the new row 704 may be, for example, a temporary buffer in the row transformer 232 and does not correspond to an actual row of pixels in the video segment. Pixels in the previous row (i.e. the original row 702 ) are grouped 606 into pairs, with each pair comprising two adjacent pixels.
  • pixel A and B form a first pair 712
  • pixels C and D form a second pair, and so on.
  • Values in the first section of the new row are set 608 to be the sums of each pair in the previous row.
  • the first entry is set to (A+B), the second entry set to (C+D), and so on for the first four elements of row 704 .
  • Values in the second section are set 610 to the differences of each pair in the previous row 702 .
  • the 5 th entry is set to (A−B)
  • the 6 th entry is set to (C−D), and so on.
  • all or some of the entries may be scaled by a constant value. Scaling by one-half in the summed entries, for example, will provide an average of the values.
  • step 614 the next row 706 is created.
  • all values are copied 616 from the previous row 704 except for values in the 1 st section.
  • entries 5 - 8 in row 706 are equivalent to entries 5 - 8 in row 704 .
  • the first section of the new row 706 is divided 618 into a new first and second section of equal size.
  • the process then repeats 620 back to step 606 and iterates until the first section is a single entry and cannot be divided any further.
  • the final values of the entries are illustrated in row 708 .
  • the final row 708 then overwrites 622 the original row 702 .
  • when transforms for the other dimensions (column and time column) are applied, they are applied to the results of the transform in the previous dimension (not the original pixels). It is noted that in different embodiments, transforms may be applied to the dimensions in any order. Furthermore, in alternate variations, the sums and/or differences of pairs can instead be replaced by some other aggregate function, such as an average function.
  • the resulting values in 708 provide information relating to edge differences in the original row of pixels 702 .
  • the first value of 708 represents a sum of all the pixels in the original row 702 .
  • the second value represents the difference between the sum of values in the first half and the sum of values in the second half of original row 702 .
  • the third value represents the difference between the first quarter and second quarter, the fourth value represents the difference between the third quarter and fourth quarter, and so on.
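  • the row transform of FIGS. 6 and 7 can be sketched as follows; this is an illustrative Python rendering of the iterative pairwise sum/difference process, with the final entries matching the descriptions above (total sum first, then half, quarter, and pairwise differences):

```python
def haar1d(values):
    """Unnormalized Haar wavelet transform of a length-2^n row.

    Each pass replaces the working section with pairwise sums (first
    half) followed by pairwise differences (second half), then recurses
    on the sums until the first section is a single entry.
    """
    out = list(values)
    section = len(out)          # length of the section still being processed
    while section > 1:
        sums = [out[i] + out[i + 1] for i in range(0, section, 2)]
        diffs = [out[i] - out[i + 1] for i in range(0, section, 2)]
        out[:section] = sums + diffs
        section //= 2
    return out

# For the 8-pixel row A..H = 1..8 of FIG. 7:
# haar1d([1, 2, 3, 4, 5, 6, 7, 8]) → [36, -16, -4, -4, -1, -1, -1, -1]
```

The first entry (36) is the sum of all eight pixels, the second is the difference between the two halves, and so on, exactly as the entries above describe.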
  • Alternate techniques can be used to compute the Haar wavelet transform. For example, techniques using boxlets, summed-area tables, or integral images may be utilized.
  • partial sums are first formed across the original row of pixels. In the partial sum, the value stored at a particular pixel location is equal to the sum of that pixel's luminance value plus the luminance values of all previous pixels. It is observed that the values in 708 are the differences between the sums of adjacent ranges of the original pixels. Then, the entries in the final result 708 can be computed directly by the differences of two partial sums.
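  • the partial-sum shortcut can be sketched as follows (illustrative Python; the function name is hypothetical): each Haar coefficient is a difference of two range sums, and with a prefix-sum array any range sum costs a single subtraction:

```python
def haar1d_prefix(values):
    """Haar transform computed from partial (prefix) sums.

    prefix[i] holds the sum of values[0:i], so any range sum is one
    subtraction, and every coefficient in the final result is the
    difference between the sums of two adjacent ranges, as observed
    in the text.
    """
    n = len(values)
    prefix = [0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v

    def range_sum(lo, hi):
        return prefix[hi] - prefix[lo]

    out = [range_sum(0, n)]        # first entry: sum of the whole row
    half = n // 2
    while half >= 1:
        # difference coefficients for adjacent blocks of width `half`
        for start in range(0, n, 2 * half):
            out.append(range_sum(start, start + half)
                       - range_sum(start + half, start + 2 * half))
        half //= 2
    return out

# Same result as the iterative pairwise process of FIGS. 6-7:
# haar1d_prefix([1, 2, 3, 4, 5, 6, 7, 8]) → [36, -16, -4, -4, -1, -1, -1, -1]
```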
  • the transform process of FIGS. 6 and 7 is repeated for each row, column, and time column in the video segment by the respective transform modules.
  • the result is a three-dimensional array of coefficients that represents the spatial and sequential characteristics of all frames in the segment, and which is outputted by the transform module 230 .
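  • a minimal sketch of the full three-dimensional pass, assuming a segment indexed as segment[t][y][x] and the 1-D pairwise sum/difference transform of FIGS. 6-7 (names are illustrative; each pass operates on the output of the previous pass, per the text):

```python
def haar1d(values):
    """1-D unnormalized Haar transform: pairwise sums then differences,
    recursing on the sums, as in FIGS. 6-7."""
    out = list(values)
    section = len(out)
    while section > 1:
        sums = [out[i] + out[i + 1] for i in range(0, section, 2)]
        diffs = [out[i] - out[i + 1] for i in range(0, section, 2)]
        out[:section] = sums + diffs
        section //= 2
    return out

def haar3d(segment):
    """Apply the transform along rows (X), columns (Y), and time
    columns (t) of a segment indexed as segment[t][y][x]."""
    T, Y, X = len(segment), len(segment[0]), len(segment[0][0])
    # copy so the caller's pixel data is untouched
    s = [[list(row) for row in frame] for frame in segment]
    for t in range(T):                      # rows
        for y in range(Y):
            s[t][y] = haar1d(s[t][y])
    for t in range(T):                      # columns
        for x in range(X):
            col = haar1d([s[t][y][x] for y in range(Y)])
            for y in range(Y):
                s[t][y][x] = col[y]
    for y in range(Y):                      # time columns
        for x in range(X):
            tc = haar1d([s[t][y][x] for t in range(T)])
            for t in range(T):
                s[t][y][x] = tc[t]
    return s
```

For a constant 2×2×2 segment of ones, all energy collapses into the single DC coefficient at [0][0][0], reflecting that such a segment has no edge differences in any dimension.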
  • the quantization module 240 quantizes 408 the three-dimensionally transformed segment.
  • the quantization module 240 determines the N coefficients with the largest absolute values; N may be a predetermined number or may be determined dynamically based on various constraints.
  • the quantization module 240 quantizes the N coefficients to +1 or −1 by preserving the signs of the coefficients and sets the remaining coefficients to zero.
  • coefficients are quantized by comparing the magnitude of each coefficient to a predetermined threshold value. Any coefficient with a magnitude greater than the threshold value is quantized to +1 or −1 by preserving its sign, and the remaining coefficients are set to zero.
  • the quantization module 240 quantizes only the N greatest coefficients that have a magnitude greater than a threshold value to +1 or −1, and sets the remaining coefficients to zero.
  • the quantization module 240 encodes 410 the three-dimensional coefficient array and flattens the array to a one-dimensional bit vector. If, for example, each coefficient is quantized to +1, −1, or 0, a two-bit encoding scheme uses the bits 10 for +1, 01 for −1, and 00 for zero. Various other encoding techniques are possible without departing from the scope of the invention.
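  • the N-largest-coefficient quantization and the two-bit encoding described above can be sketched as follows (illustrative Python over an already-flattened list of coefficients; function names are hypothetical):

```python
def quantize(coeffs, n_keep):
    """Keep the signs of the n_keep largest-magnitude coefficients
    (quantized to +1/-1) and zero out all the rest."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]),
                    reverse=True)
    keep = set(ranked[:n_keep])
    return [(1 if coeffs[i] > 0 else -1) if i in keep and coeffs[i] != 0 else 0
            for i in range(len(coeffs))]

def encode(quantized):
    """Two-bit encoding from the text: 10 for +1, 01 for -1, 00 for 0."""
    table = {1: "10", -1: "01", 0: "00"}
    return "".join(table[q] for q in quantized)

# quantize([5, -3, 0.5, -0.2], 2) → [1, -1, 0, 0]
# encode([1, -1, 0, 0]) → "10010000"
```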
  • the output of the quantization module 240 is a quantized and encoded bit vector that forms a fingerprint for a single video segment.
  • the fingerprinting process then repeats 412 for each video segment in the video.
  • the ordered set of video fingerprints generated by the process forms a fingerprint sequence for the entire video file.
  • a fingerprint sequence can be compared to a reference fingerprint sequence by counting the number of differences between the bits in the respective sequences. This comparison provides a good indication of the similarity between the videos associated with the fingerprint sequences.
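  • counting bit differences between two fingerprints is a Hamming distance; a minimal sketch, assuming fingerprints are represented as equal-length bit strings:

```python
def bit_differences(fp_a, fp_b):
    """Count differing bit positions (Hamming distance) between two
    equal-length fingerprint bit vectors, given as strings of 0s/1s."""
    assert len(fp_a) == len(fp_b)
    return sum(a != b for a, b in zip(fp_a, fp_b))

# Lower counts indicate more similar video segments:
# bit_differences("10010000", "10010010") → 1
```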
  • the fingerprints are indexed by the indexing module 108 .
  • An example process for indexing uses a min-hash process as illustrated in FIG. 8 .
  • the min-hash process generates a “signature” for the video fingerprint by applying a set of P permutations to the bit values of the fingerprint.
  • the signature contains fewer bits than the full fingerprint but retains most of the information in the associated fingerprint.
  • the video fingerprint is in the form of a bit vector that represents the flattened quantized three-dimensional transform results for an individual segment.
  • the indexing module applies a number P of permutations to the bits of the fingerprint.
  • Each permutation defines a bit re-arrangement (e.g., bit swap) of the bits of the fingerprint; the permutation may be a random permutation or algorithmic.
  • the permutations are preferably defined beforehand, but once defined the permutations are fixed and always applied in the same order.
  • the indexing module 108 receives 802 a fingerprint for a video segment.
  • a new bit vector is generated 806 by re-arranging the bits according to a first permutation P 1 .
  • a scanning module scans 808 for the location of the first bit value of “1” in the re-arranged bit vector and records 810 this location to a location vector. This process of permutation and location recording repeats 814 for all P permutations.
  • each received fingerprint will have the same set of P permutations applied in the same order.
  • the output is a location vector having P values, with each value indicating a location of the first bit value of “1” in the underlying fingerprint after applying each permutation. This set of locations provides the signature for the fingerprint.
  • each signature is divided into a number of signature blocks and each signature block is placed into a different hash table. For each entry in the hash tables, unique identifiers of any video segment that generates that particular signature block are stored with the corresponding signature block.
  • FIG. 9 illustrates an example of indexed fingerprints using the min-hash and locality sensitive hashing techniques described above.
  • VID 4 comprising 100 segments and VID 7 comprising 365 segments are shown.
  • a first signature 902 a corresponds to a first fingerprint of the second video segment of VID 4.
  • the signature 902 a is represented by a sequence of P locations (e.g., 11, 32, 11, 18 . . . ).
  • the signature is broken into signature blocks 906 of four locations each. According to various embodiments, different sized signature blocks are used.
  • a second signature 902 b corresponds to the third video segment of VID 7.
  • the first signature block in each signature 902 is mapped to table 1, the second signature block is mapped to table 2, and so on.
  • the tables store each signature block and a unique identifier for all video segments that generated each particular signature block.
  • the tables also associate an index number with each unique signature block representing an offset into the table, although the index number itself need not be explicitly stored.
  • index 1 corresponds to the signature block having the sequence 11, 32, 11, 18.
  • the signature block stored at index 1 corresponds to the sequence (563, 398, 13, 6). Because both VID 4, segment 2 and VID 7, segment 3 have this sequence as their second signature block, both segments are mapped to index 1.
  • each video segment can be assigned a unique identifier, which is used in these tables in place of the tuple (video, segment).
  • the matching module 110 can be used to efficiently compare and match fingerprints of video files. Using the hash techniques described above, videos can be compared simply by comparing the index values of their signature blocks for each segment, rather than performing a bit-for-bit comparison of the entire fingerprint sequence. An example matching process is illustrated in FIG. 10 .
  • a signature sequence (corresponding to the ordered signatures of a fingerprint sequence) for an ingested video is received by the matching module 110 .
  • Each signature block of a first signature in the signature sequence is hashed 1004 into the corresponding hash tables. For every matching signature block found in the table, a separate count is incremented for each unique video identifier associated with the matching signature block.
  • each reference fingerprint maintains a separate count indicating the number of signature blocks of the reference fingerprint that match signature blocks of the first fingerprint of the ingest video.
  • the counts are used to determine 1006 a matching score between the first fingerprint of the ingest video and each reference fingerprint of each video segment in the reference database 112 .
  • the matching scores are compared against a threshold value to determine 1008 all reference fingerprints having matching scores above the threshold. Reference fingerprints with matching scores above the threshold are designated as matching fingerprints. This process then repeats for each individual fingerprint of the fingerprint sequence of the ingest video.
  • the matching module 110 determines the reference video with the longest consecutive sequence of fingerprints that match the fingerprint sequence of the ingest video. Because each fingerprint corresponds to a time segment of video, this method determines a reference video that matches the ingest video over the longest consecutive time period.
  • a fixed length window of time (e.g., 15 seconds) is designated for the ingest video.
  • the fixed length window of time corresponds to a block of fingerprints in the fingerprint sequence of the ingest video.
  • time offsets are determined between each matching segment of the reference video and the corresponding segments of the ingest video.
  • Each matching pair of segments casts a “vote” for a particular time offset. The votes are counted across all matching pairs and the reference window with the highest number of votes is designated as the best match.
  • the systems and methods described above enable indexing a video library using video fingerprints and matching video content based on spatial and sequential characteristics of the video. This is particularly useful, for example, in finding and removing duplicate video content and preventing sharing of copyright protected content. Moreover, the methods can be performed automatically and are therefore more efficient and cost effective than conventional techniques.
  • the present invention has been described in particular detail with respect to a limited number of embodiments. Those of skill in the art will appreciate that the invention may additionally be practiced in other embodiments. First, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols.
  • system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements.
  • the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.
  • Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
  • the present invention also relates to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer.
  • a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
  • the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
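The min-hash signature construction described in the indexing steps above (apply a fixed, ordered set of P permutations to the fingerprint's bits and record the location of the first "1" after each) can be sketched as follows. The permutation generator, the seed, and the 8-bit toy fingerprint are illustrative assumptions, not the patent's actual parameters:

```python
import random

def make_permutations(p, length, seed=0):
    """Generate P fixed permutations of bit positions. Once defined, the
    same permutations are applied in the same order to every fingerprint."""
    rng = random.Random(seed)
    perms = []
    for _ in range(p):
        perm = list(range(length))
        rng.shuffle(perm)
        perms.append(perm)
    return perms

def min_hash_signature(fingerprint_bits, perms):
    """For each permutation, rearrange the fingerprint's bits and record
    the location of the first '1'; assumes at least one '1' bit is set."""
    signature = []
    for perm in perms:
        rearranged = [fingerprint_bits[i] for i in perm]
        signature.append(rearranged.index(1))
    return signature

fp = [0, 1, 0, 0, 1, 0, 0, 0]          # toy 8-bit fingerprint
perms = make_permutations(4, len(fp))
print(min_hash_signature(fp, perms))   # 4 first-'1' locations, one per permutation
```

Because the first-"1" location changes only when a permutation moves a differing bit ahead of all shared "1" bits, similar fingerprints tend to produce similar signatures, which is what makes the signature a useful compact proxy for the full bit vector.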

Abstract

A method and system generates and compares fingerprints for videos in a video library. The video fingerprints provide a compact representation of the spatial and sequential characteristics of the video that can be used to quickly and efficiently identify video content. Because the fingerprints are based on spatial and sequential characteristics rather than exact bit sequences, visual content of videos can be effectively compared even when there are small differences between the videos in compression factors, source resolutions, start and stop times, frame rates, and so on. Comparison of video fingerprints can be used, for example, to search for and remove copyright protected videos from a video library. Further, duplicate videos can be detected and discarded in order to preserve storage space.

Description

BACKGROUND
1. Field of Art
The invention generally relates to video processing, and more specifically to video fingerprinting.
2. Description of the Related Art
Electronic video libraries may contain thousands or millions of video files, making management of these libraries an extremely challenging task. The challenges become particularly significant in the case of online video sharing sites where many users can freely upload video content. In some cases, users upload unauthorized copies of copyrighted video content, and as such, video hosting sites need a mechanism for identifying and removing these unauthorized copies. While some files may be identified by file name or other information provided by the user, this identification information may be incorrect or insufficient to correctly identify the video. An alternate approach of using humans to manually identify video content is expensive and time consuming.
Another problem faced by video sharing sites is that users may upload multiple copies of video content to the site. For example, popular items such as music videos may be uploaded many times by multiple users. This wastes storage space and becomes a significant expense to the host. A third problem is that due to the large number of files, it is very difficult to organize the video library based on video content. Thus, search results may have multiple copies of the same or very similar videos making the results difficult to navigate for a user.
Various methods have been used to automatically detect similarities between video files based on their video content. In the past, various identification techniques (such as an MD5 hash on the video file) have been used to identify exact copies of video files. Generally, a digital “fingerprint” is generated by applying a hash-based fingerprint function to a bit sequence of the video file; this generates a fixed-length monolithic bit pattern—the fingerprint—that uniquely identifies the file based on the input bit sequence. Then, fingerprints for files are compared in order to detect exact bit-for-bit matches between files. Alternatively, instead of computing a fingerprint for the whole video file, a fingerprint can be computed for only the first frame of video, or for a subset of video frames. However, each of these methods often fails to identify videos uploaded by different users with small variations that change the exact bit sequences of the video files. For example, videos may be uploaded from different sources and may vary slightly in how they are compressed and decompressed. Further, different videos may have different source resolutions, start and stop times, frame rates, and so on, any of which will change the exact bit sequence of the file and thereby prevent them from being identified as copies of an existing file.
Other attempts to solve the described problems have involved applying techniques related to finding duplicate images. In these techniques individual frames of the video are treated as separate and independent images. Image transforms are performed to extract information representing spatial characteristics of the images that are then compared. However, there are two main weaknesses in this technique when trying to handle video. First, video typically contains an enormous number of image frames. A library may easily contain thousands or millions of videos, each having frame rates of 15 to 30 frames per second or more, and each averaging several minutes in length. Second, directly applying image matching techniques to video ignores important sequential information present in video. This time information is extremely valuable in both improving detection of duplicates and reducing the amount of data that needs to be processed to a manageable quantity, but is presently ignored by most techniques.
In view of the problems described above, an improved technique is needed for finding similarities between videos and detecting duplicate content based on the perceived visual content of the video. In addition, a technique is needed for comparing videos that is unaffected by small differences in compression factors, source resolutions, start and stop times, frame rates, and so on. Furthermore, the technique should be able to compare and match videos automatically without relying on manual classification by humans.
SUMMARY
A method and system generates and compares fingerprints for videos in a video library using fingerprints that represent spatial information within certain frames of the video, as well as sequential information between frames. The methods for generating video fingerprints provide a compact representation of the spatial and sequential characteristics that can be used to quickly and efficiently identify video content. The methods also allow for comparing videos by using their fingerprints in order to find a particular video with matching content (such as, for example, to find and remove copyright protected videos or to find and remove duplicates). In addition, the methods enable organizing and/or indexing a video library based on their visual content by using video fingerprints. This can provide improved display of search results by grouping videos with matching content.
A video fingerprint is generated by applying a three-dimensional transform to a video segment. The video fingerprint represents both the spatial characteristics within the frames of the video segment and sequential characteristics between frames; the transform is said to be three-dimensional because the spatial information within frames provides two dimensions of information, while the sequential information provides the third dimension of temporal information. Furthermore, because the fingerprint is based on the spatial and sequential characteristics of the video segment rather than an exact bit sequence, video content can be effectively compared even when videos have variations in compression factors, source resolutions, start and stop times, frame rates, and so on. The set of fingerprints associated with the segments of a video provides a fingerprint sequence for the video.
The set of video fingerprints for a received video can be compared against reference fingerprints for videos stored in a reference database. In this manner, matching videos can be efficiently located. This is useful for at least two reasons. First, when a video is uploaded to a file sharing site, it may be immediately checked against all videos in the library. If matches are found, the video can be properly indexed in order to eliminate presentation of duplicates in search results. Alternatively, it may be desirable to discard the uploaded video if any matches are found and only accept new entries to the library that are unique. Second, if a video is known to be copyright protected, its fingerprint can be used to efficiently search for visually identical videos in the library so that copyrighted material can be removed.
A system for detecting duplicate video content includes an ingest server, a fingerprinting module, an indexing module, a matching module, and a reference database. The ingest server receives an input video from a video source and provides the video to the fingerprinting module, which generates a fingerprint sequence for the ingest video. Each fingerprint in the fingerprint sequence is indexed by the indexing module according to one or more hash processes which selectively reduce the dimensionality of the fingerprint data. A matching module compares fingerprints and/or fingerprint sequences in the reference database to the fingerprint sequence associated with the ingest video and determines if a match is found. The matching module may be used both to locate particular video content from a query and to organize video search results based on their content.
Fingerprints can be generated using various techniques provided that each fingerprint is based upon the intra-frame spatial and inter-frame sequential (temporal) characteristics of the video. In one described embodiment, a system for generating a video fingerprint sequence includes a normalization module, a segmenting module, a transform module, and a quantization module. The normalization module converts received videos to a standard format for fingerprinting. The segmenting module segments the normalized video into a number of segments, each segment including a number of frames. Each segment of frames is separately transformed by the transform module in the horizontal, vertical, and time dimensions. This three-dimensional transform computes frequency information about edge differences in the spatial and temporal dimensions. The result is a three-dimensional array of coefficients that will be unique to the spatial and sequential characteristics of the group of frames. A Haar wavelet transform provides one example of a transform that can be used for this purpose; various other transforms may also be utilized. A quantizing module quantizes the three-dimensionally transformed segment in order to reduce the amount of data while still preserving the spatial and sequential characteristics of the video. The quantized transform results provide a video fingerprint for each video segment. A fingerprint sequence for the video is formed from the ordered set of fingerprints of the video segments.
The features and advantages described in the specification are not all inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 is a high level block diagram illustrating a system for comparing video content in a video library.
FIG. 2 is a block diagram illustrating an architecture for generating a video fingerprint.
FIG. 3 is a diagram illustrating a video structure as a series of frames.
FIG. 4 is a flowchart illustrating a process for generating a video fingerprint.
FIG. 5 is a diagram illustrating a technique for segmenting a video into overlapping segments.
FIG. 6 is a flowchart illustrating a process for computing a transform used in generating a video fingerprint.
FIG. 7 is a diagram illustrating computation of a transform used in generating a video fingerprint.
FIG. 8 is a flowchart illustrating a process for indexing video fingerprints.
FIG. 9 illustrates an example of indexed video segments.
FIG. 10 is a flowchart illustrating a process for matching video fingerprints.
The figures depict various embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.
DETAILED DESCRIPTION
FIG. 1 is a high-level block diagram illustrating a system for comparing video content. The system comprises an ingest server 104, a fingerprinting module 106, an indexing module 108, a matching module 110, a reference database 112, and a video library 116. In alternative configurations, different or additional modules may be used.
The ingest server 104 is adapted to receive one or more videos from a video source 102. The video source 102 can be, for example, a client computer coupled to the ingest server 104 through a network. In this configuration, a user can upload video content to the ingest server 104 from a remote location. Alternatively, the video source 102 can be a database or other storage device coupled to the ingest server 104. For example, the video source 102 can be a video storage medium such as a DVD, CD-ROM, Digital Video Recorder (DVR), hard drive, Flash memory, or other memory. The ingest server 104 may also be coupled directly to a video capture system such as a video camera.
The ingest server 104 stores the received videos to the video library 116. The ingest server 104 can also pass received videos directly to the fingerprinting module 106 for fingerprinting immediately upon receipt. The ingest server 104 pre-processes the received video to convert it to a standard format for storage in the video library 116. For example, the ingest server 104 can convert the frame rate, frame size, and color depth of a received video to predetermined formats, such as Adobe FLASH® with a frame size of 320×240 at 15 fps and 8-bit color.
The fingerprinting module 106 receives a video from the ingest server 104 or from the video library 116 and generates a sequence of fingerprints associated with the video. Typically, the fingerprint module 106 divides the received video into multiple overlapping segments with each segment comprising a number of video frames, and a fingerprint is separately generated for each segment. Each fingerprint compactly represents spatial information within the group of video frames in the video segment and sequential characteristics between frames of the video segment. The fingerprint uniquely identifies a video segment based on its visual content such that minor variations due to compression, de-compression, noise, frame rate, start and stop time, source resolutions and so on do not significantly affect the fingerprint generated for the video segment. The complete ordered set of video fingerprints for the segments of a video provides a fingerprint sequence for the video.
The indexing module 108 receives the video fingerprint sequences for each video from fingerprinting module 106 and indexes the fingerprint sequences into the reference database 112. The indexing process can use a variety of different hash techniques to generate a signature for a fingerprint that uniquely identifies the fingerprint while fixing the size of the fingerprint data. The signature is broken into signature blocks and indexed in hash tables. Indexing beneficially reduces the number of bit comparisons needed to compare two fingerprints. Thus, searches for matching fingerprints can be accelerated relative to direct bit-for-bit comparisons of fingerprints.
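The signature-block indexing just described can be sketched as follows, reusing the block values from the FIG. 9 example. The table layout (one hash table per block position, keyed by the block tuple and mapping to segment identifiers) is an illustrative assumption; block position here is 0-based, whereas the figure numbers tables from 1:

```python
from collections import defaultdict

BLOCK_SIZE = 4  # locations per signature block, as in the FIG. 9 example

def index_signature(tables, signature, segment_id):
    """Split a min-hash signature into blocks and record the segment id
    in one hash table per block position."""
    for b in range(0, len(signature), BLOCK_SIZE):
        block = tuple(signature[b:b + BLOCK_SIZE])
        tables[b // BLOCK_SIZE][block].add(segment_id)

tables = defaultdict(lambda: defaultdict(set))
index_signature(tables, [11, 32, 11, 18, 563, 398, 13, 6], ("VID 4", 2))
index_signature(tables, [42, 5, 9, 77, 563, 398, 13, 6], ("VID 7", 3))

# Both segments share the second signature block (563, 398, 13, 6),
# so both segment identifiers collide in the second table:
print(tables[1][(563, 398, 13, 6)])  # contains ('VID 4', 2) and ('VID 7', 3)
```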
The matching module 110 compares videos or video segments and generates a matching score indicating the likelihood of a match. The matching module 110 compares the fingerprint sequence of an ingest video to reference fingerprint sequences stored in the reference database 112. Alternatively, the matching module 110 compares fingerprint sequences in the reference database 112 corresponding to two or more videos stored in video library 116. The matching module 110 may further receive a search query from a user requesting particular content and output a video 118 from the video library 116 that matches the query 114.
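A hedged sketch of the vote-counting match score (count, for each reference segment, how many of a query's signature blocks hit the corresponding hash tables); the toy reference index and segment names are assumptions for illustration:

```python
from collections import Counter

BLOCK_SIZE = 4

def blocks(signature):
    """Split a signature into fixed-size blocks of locations."""
    return [tuple(signature[b:b + BLOCK_SIZE])
            for b in range(0, len(signature), BLOCK_SIZE)]

def match_scores(tables, query_signature):
    """Count, per reference segment, how many of the query's signature
    blocks are found in the hash table for the same block position."""
    counts = Counter()
    for t, block in enumerate(blocks(query_signature)):
        for segment_id in tables[t].get(block, ()):
            counts[segment_id] += 1
    return counts

# Toy reference index: one table per block position, mapping blocks to segment ids.
tables = {
    0: {(11, 32, 11, 18): {"ref A"}},
    1: {(563, 398, 13, 6): {"ref A", "ref B"}},
}
scores = match_scores(tables, [11, 32, 11, 18, 563, 398, 13, 6])
print(scores)  # "ref A" matches both blocks (score 2), "ref B" matches one (score 1)
```

Comparing the resulting scores against a threshold, as the process of FIG. 10 describes, then designates which reference fingerprints count as matches.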
The video library 116 is a storage device for storing a library of videos. The video library 116 may be any device capable of storing data, such as, for example, a file server, a hard drive, a writeable compact disk (CD) or DVD, or a solid-state memory device. Videos in the video library 116 are generally received from the ingest server 104 and can be outputted to the fingerprinting module 106 for fingerprinting. Videos relevant to a search query 114 are also outputted 118 by the matching module 110.
The reference database 112 stores the indexed fingerprints for each video in the video library 116. Each entry in the reference database 112 corresponds to signature blocks generated in the indexing process. Each entry is mapped to unique identifiers of the video segments corresponding to each signature block. The reference database 112 can be searched by the matching module 110 to quickly compare fingerprints and/or fingerprint sequences.
The described system can implement several usage scenarios. A first scenario enables the system to query-by-video to find identical or similar videos to a selected video. Here, a system operator provides an input query 114 to the matching module 110. The input query 114 is in the form of a video having particular content of interest such as, for example, video content that is copyright protected. A fingerprint sequence is generated for the copyright protected video and the reference database 112 is searched for matching fingerprints. Unauthorized copies can then be removed from the video library 116 (or otherwise processed) if the matching module 110 detects a match. In addition, new uploads can be automatically screened for unauthorized copies of known copyrighted works. Here, a newly uploaded video is fingerprinted and the fingerprint sequence is compared against fingerprint sequences for the known copyrighted videos. Then, matching uploads are blocked from storage in the video library 116. In one embodiment, the video can be processed in pieces as it is received so that the full video need not be received before processing begins.
In a second scenario, the system is used to detect and remove multiple copies of video content from the video library 116. Duplicate or near duplicate videos may be found within the video library 116, or new videos uploaded by the ingest server 104 may be automatically compared against videos in the video library 116. Duplicate videos found in the video library 116 are removed in order to save storage space. In one embodiment, if a new video is received that already has a duplicate in the video library 116, the new video is simply discarded.
In another scenario, the system can be used to provide organized search results of videos. In this scenario, a user provides an input query 114 and the matching module 110 returns relevant video results. The input query 114 can be in the form of a conventional text-based search query or can be in the form of a video file as described previously. Using their fingerprint sequences, video results are compared to one another by the matching module 110 and matching videos are grouped together in the search results.
Referring now to FIG. 2, an embodiment of a fingerprinting module 106 for generating fingerprints of a received video is illustrated. The fingerprinting module 106 is adapted to receive an input video that has been pre-processed by the ingest server 104, and generate one or more fingerprints representing spatial and sequential characteristics associated with the video. The fingerprinting module 106 comprises a normalization module 210, a segmenting module 220, a transform module 230, and a quantization module 240. In alternative configurations, the fingerprinting module 106 can have additional or different modules than those illustrated.
An example structure for a video received by the fingerprinting module 106 is provided in FIG. 3. The video comprises a series of frames 300. Each frame 300 comprises an image having a plurality of pixels arranged in a two-dimensional grid (for example, in an X direction and a Y direction). The frames 300 are also arranged sequentially in time (the t direction). Thus, a video comprises both spatial information, defined by the arrangement of pixels in the X and Y directions, and sequential or temporal information defined by how the pixels change throughout the time (t) dimension.
Turning back to FIG. 2, the normalization module 210 generally standardizes the data to be processed during fingerprinting. The normalization module 210 includes a frame rate converter 212, a frame size converter 214, and a color converter 216 to normalize video to a predetermined format for fingerprinting. Converting video to a standardized fingerprint format ensures that videos are consistent and can produce comparable results. Often, frame rate, frame size, and color information are reduced by the normalization module 210 in order to improve the speed and power efficiency of the fingerprinting process. For example, the normalization module 210 can convert the video to luminance (grayscale) values without color, reduce the frame rate to 15 fps, and reduce the frame size to 64×64. To simplify computation, the number of pixels in each row and column of the frame size is preferably a power of 2 (e.g., 64×64), but any frame size is possible. Each of the standard formats used by the normalization module 210 may be predetermined or may be determined dynamically based on various constraints such as, for example, available power, available bandwidth, or characteristics of the received video.
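A minimal normalization sketch (luminance conversion plus block-average downscaling to 64×64). The Rec. 601 luma weights, the toy input resolution, and the requirement that the input dimensions divide evenly by the target size are illustrative simplifications; frame-rate conversion and a proper resampling filter are omitted:

```python
import numpy as np

def normalize_frame(frame_rgb, size=64):
    """Reduce one RGB frame to a size x size grayscale (luminance) frame.
    Assumes input height and width are multiples of `size`."""
    # Rec. 601 luma weights for the RGB -> luminance conversion (assumed here).
    luma = frame_rgb @ np.array([0.299, 0.587, 0.114])
    h, w = luma.shape
    # Block-average down to the target resolution.
    return luma.reshape(size, h // size, size, w // size).mean(axis=(1, 3))

frame = np.random.randint(0, 256, size=(256, 512, 3)).astype(float)
small = normalize_frame(frame)
print(small.shape)  # (64, 64)
```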
The segmenting module 220 receives the normalized video from the normalization module 210 and divides the video into a number of segments with each segment including a number of frames. The segments may be stored, for example, in temporary buffers and outputted separately to the transform module 230. The segments preferably overlap by some number of frames.
The transform module 230 operates on the video segments obtained from the segmenting module 220. The transform module 230 comprises a row transformer 232, a column transformer 234, and a time column transformer 236 for performing a three-dimensional transform on each video segment. This three-dimensional transform computes frequency information about edge differences in two spatial dimensions and one temporal dimension. Because the transform results are based on the spatial and sequential characteristics rather than an exact bit sequence, the fingerprint can identify a video segment based on its content even in the presence of variations in compression factors, source resolutions, start and stop times, frame rates, and so on. The output of the transform module 230 is a three-dimensional array of coefficients that will be unique to the spatial and sequential characteristics of the group of frames in each video segment.
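The separable three-dimensional transform can be sketched by applying a one-dimensional transform along each axis in turn (rows, columns, then time columns). The averaging/difference Haar variant and the toy segment size below are assumptions for illustration, not the patent's stated parameters:

```python
import numpy as np

def haar_1d(a):
    """Haar transform of a 1-D array whose length is a power of two:
    repeatedly replace the current band with pairwise averages and differences."""
    a = a.astype(float)
    n = a.size
    while n > 1:
        half = n // 2
        avg = (a[0:n:2] + a[1:n:2]) / 2.0
        diff = (a[0:n:2] - a[1:n:2]) / 2.0
        a[:half], a[half:n] = avg, diff
        n = half
    return a

def transform_3d(segment):
    """Apply the 1-D transform along rows (x), columns (y), and time
    columns (t) of a (frames, height, width) array."""
    out = segment.astype(float)
    for axis in (2, 1, 0):  # rows, then columns, then time columns
        out = np.apply_along_axis(haar_1d, axis, out)
    return out

segment = np.random.rand(8, 4, 4)  # 8 frames of 4x4 pixels (toy sizes)
coeffs = transform_3d(segment)
print(coeffs.shape)  # (8, 4, 4)
```

With this averaging variant, the coefficient at index (0, 0, 0) is the mean of the whole segment, and the remaining coefficients capture edge differences at successively finer spatial and temporal scales.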
A quantization module 240 quantizes the three-dimensionally transformed segment in order to standardize the data size while still preserving the spatial and sequential characteristics of the video. Additionally, the quantization module 240 encodes and flattens the transformed coefficient array to a one-dimensional bit vector. The one-dimensional bit vector provides a fingerprint for an individual video segment.
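As a sketch of the quantization and encoding steps (top-N selection by magnitude, sign preservation, and the two-bit 10/01/00 code described in the embodiments earlier in this document); the coefficient values and the choice of N are illustrative:

```python
def quantize(coeffs, n):
    """Keep the signs of the N largest-magnitude coefficients (+1/-1)
    and set the remaining coefficients to zero."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:n])
    return [(1 if coeffs[i] > 0 else -1) if i in keep else 0
            for i in range(len(coeffs))]

def encode(quantized):
    """Two-bit encoding: 10 for +1, 01 for -1, 00 for 0, flattened to a bit string."""
    codes = {1: "10", -1: "01", 0: "00"}
    return "".join(codes[q] for q in quantized)

coeffs = [6.0, 0.0, 2.0, -2.0, 1.0, -1.0, 0.5, -0.5]
q = quantize(coeffs, 3)
print(q)          # [1, 0, 1, -1, 0, 0, 0, 0]
print(encode(q))  # 1000100100000000
```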
Referring now to FIG. 4, a flowchart illustrates a process for generating a fingerprint sequence using the fingerprinting module 106. An input video is received by the fingerprinting module 106 and normalized 402 by the normalization module 210. Here, the frame size converter 214 scales down the size of the received video frames. The converted frames have a fixed number of pixels across the row and column of each frame. The color converter 216 generally reduces the color information in the video, for example, by converting to grayscale such that each pixel is represented by a single luminance value.
The segmenting module 220 separates 404 the normalized video into a number of segments of consecutive video frames that may be overlapping. In one embodiment, each video segment includes the same number of video frames, typically a number equal to 2^n, where n is an integer. Furthermore, the segments of video frames preferably overlap by a fixed number of frames. For example, referring now to FIG. 5, an axis is illustrated representing the numbered sequence of frames in a video file, with three segments S1, S2, and S3, each having 64 frames and with 16 frames between the start of each segment. A first segment S1 comprises frames 0-63, a second segment S2 comprises frames 16-79, and a third segment S3 comprises frames 32-95. Additional segments may be similarly obtained from the video. According to various other embodiments, a video segment may comprise a different number of frames, and segments may overlap by any number of frames. Furthermore, a video may be segmented into segments of varying length or varying overlap.
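The segmentation scheme above can be sketched in a few lines. This is an illustrative sketch, not the patented implementation: the function name and the list-of-frames representation are assumptions, and the 64-frame segment length and 16-frame stride simply match the FIG. 5 example.

```python
def segment_video(frames, segment_len=64, stride=16):
    """Split a frame sequence into overlapping fixed-length segments.

    Illustrative sketch: segment_len would typically be a power of
    two, and stride controls the overlap between adjacent segments.
    """
    segments = []
    # Only complete segments are kept; a trailing partial segment is dropped.
    for start in range(0, len(frames) - segment_len + 1, stride):
        segments.append(frames[start:start + segment_len])
    return segments

# With 96 frames, the first three segments cover frames 0-63, 16-79, and 32-95,
# matching segments S1, S2, and S3 of FIG. 5.
frames = list(range(96))
segs = segment_video(frames)
```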
Referring again to FIG. 4, the transform module 230 transforms 406 the video segment by applying a three-dimensional transform to the group of frames in the video segment. A transform is applied to each row, column, and time column for a video segment by the row transformer 232, column transformer 234, and time column transformer 236 respectively. Here, a row refers to a set of pixels aligned in the horizontal (X) direction of a video frame and a column refers to a set of pixels aligned in a vertical direction (Y) of a video frame. A time column refers to a set of pixels having the same horizontal and vertical location within a frame, but belonging to different frames (Z direction). In one embodiment, the row transformer, column transformer, and time column transformer apply identical mathematical functions but operate on different dimensions of the received video segment.
In one embodiment, the row transformer 232, column transformer 234, and time column transformer 236 each apply a Haar wavelet transform across their respective dimensions. In alternative embodiments, different types of transforms may be used such as, for example, a Gabor transform, or other related transform. FIG. 6 illustrates an example process for transforming a row of the video segment by applying a Haar wavelet transform. It is noted that different processes other than the process illustrated can be used to compute a Haar wavelet transform. The process in FIG. 6 can be viewed in conjunction with FIG. 7 which graphically illustrates the intermediate results of the transform at various stages of the process. For the purpose of illustration, the Haar wavelet transform is conceptually illustrated in FIG. 7 for a row of 8 pixels. It will be apparent to one of ordinary skill that described technique can be extended to any size row. Furthermore, the transform is not limited to a row, but can be similarly applied to any column or time column.
A row of pixels is received 602 by the row transformer 232. In the example illustration of FIG. 7, the row 702 comprises 8 pixels having values A-H. The values A-H may represent, for example, the luminance values of the pixels or some other characteristic of the pixels, such as color information. A new row 704 is created 604 with a length equal to the length of the original row 702. The new row 704 may be, for example, a temporary buffer in the row transformer 232 and does not correspond to an actual row of pixels in the video segment. Pixels in the previous row (i.e., the original row 702) are grouped 606 into pairs, with each pair comprising two adjacent pixels. For example, pixels A and B form a first pair 712, pixels C and D form a second pair, and so on. Values in the first section of the new row (e.g., the left half) are set 608 to the sums of each pair in the previous row. For example, the first entry is set to (A+B), the second entry is set to (C+D), and so on for the first four elements of row 704. Values in the second section (e.g., the right half) are set 610 to the differences of each pair in the previous row 702. For example, the 5th entry is set to (A−B), the 6th entry is set to (C−D), and so on. In step 612, all or some of the entries may be scaled by a constant value. Scaling the summed entries by one-half, for example, will provide an average of the values.
In step 614, the next row 706 is created. In the new row, all values are copied 616 from the values in the previous row 704 except for values in the first section. Thus, entries 5-8 in row 706 are equivalent to entries 5-8 in row 704. The first section of the new row 706 is divided 618 into new first and second sections of equal size. The process then repeats 620 back to step 606 and iterates until the first section is a single entry and cannot be divided any further. The final values of the entries are illustrated in row 708. The final row 708 then overwrites 622 the original row 702. In this way, when the transforms for the other dimensions (column and time column) are applied, they are applied to the results of the transform in the previous dimension, not to the original pixels. It is noted that in different embodiments, transforms may be applied to the dimensions in any order. Furthermore, in alternate variations, the sums and/or differences of pairs can instead be replaced by some other aggregate function, such as an average.
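The iterative sum-and-difference procedure of FIGS. 6 and 7 can be sketched as follows for a single row. The function name is illustrative, and the optional scaling step 612 is omitted, so the coefficients are unscaled sums and differences; the same routine would apply unchanged to a column or a time column.

```python
def haar_1d(row):
    """Unscaled Haar wavelet transform of one row, per FIGS. 6-7.

    Pair sums go to the first half of the working section, pair
    differences to the second half; the process then recurses on the
    first half until it is a single entry.
    """
    out = list(row)          # working copy; final result overwrites the row
    n = len(out)             # assumed to be a power of two
    while n > 1:
        half = n // 2
        prev = out[:n]       # snapshot of the current first section
        for i in range(half):
            out[i] = prev[2 * i] + prev[2 * i + 1]         # pair sums
            out[half + i] = prev[2 * i] - prev[2 * i + 1]  # pair differences
        n = half             # recurse on the first section only
    return out

# For the 8-pixel row [1..8]: first entry is the total sum (36), second is
# the difference between the two halves (10 - 26 = -16), and so on.
```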
The resulting values in 708 provide information relating to edge differences in the original row of pixels 702. As can be seen, the first value of 708 represents the sum of all the pixels in the original row 702. The second value represents the difference between the sum of values in the first half and the sum of values in the second half of the original row 702. The third value represents the difference between the first quarter and second quarter, the fourth value represents the difference between the third quarter and fourth quarter, and so on. These values provide edge information, since edges correspond to differences in luminance values, at varying frequencies: the first entries correspond to the lowest-frequency edges and the last entries correspond to the highest frequencies. Note that some of the values will be positive, some will be negative, and many will be close to or equal to zero.
Alternate techniques can be used to compute the Haar wavelet transform. For example, techniques using boxlets, summed-area tables, or integral images may be utilized. In one technique, partial sums are first formed across the original row of pixels. In the partial sum, the value stored at a particular pixel location is equal to the sum of that pixel's luminance value plus the luminance values of all previous pixels. It is observed that the values in 708 are the differences between the sums of adjacent ranges of the original pixels. The entries in the final result 708 can therefore be computed directly as the difference of two partial sums.
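The partial-sum technique can be sketched as follows; names are illustrative. It produces the same unscaled coefficients as the iterative procedure above, with each entry computed directly as the difference of two adjacent range sums read from a prefix-sum array.

```python
def haar_1d_prefix(row):
    """Unscaled Haar coefficients computed via partial (prefix) sums.

    prefix[j] holds the sum of row[0:j], so the sum of any range
    row[a:b] is prefix[b] - prefix[a].  Each coefficient is then one
    subtraction of two range sums, rather than repeated pairwise passes.
    """
    n = len(row)  # assumed to be a power of two
    prefix = [0] * (n + 1)
    for i, v in enumerate(row):
        prefix[i + 1] = prefix[i] + v

    def range_sum(a, b):
        return prefix[b] - prefix[a]

    out = [range_sum(0, n)]           # first entry: sum of the whole row
    width = n
    while width > 1:                  # halves, then quarters, then pairs...
        half = width // 2
        for start in range(0, n, width):
            out.append(range_sum(start, start + half)
                       - range_sum(start + half, start + width))
        width = half
    return out
```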
The transform process of FIGS. 6 and 7 is repeated for each row, column, and time column in the video segment by the respective transform modules. The result is a three-dimensional array of coefficients that represents the spatial and sequential characteristics of all frames in the segment, and which is output by the transform module 230.
Referring again to FIG. 4, the quantization module 240 quantizes 408 the three-dimensionally transformed segment. Various quantization techniques are possible. For example, in one quantization process, the quantization module 240 determines the N coefficients with the largest absolute values; N may be a predetermined number or may be determined dynamically based on various constraints. The quantization module 240 quantizes the N coefficients to +1 or −1 by preserving the signs of the coefficients and sets the remaining coefficients to zero. In a second example, coefficients are quantized by comparing the magnitude of each coefficient to a predetermined threshold value. Any coefficient with a magnitude greater than the threshold value is quantized to +1 or −1 by preserving its sign, and the remaining coefficients are set to zero. In a third example quantization process, constraints are placed on both the number of coefficients and their magnitudes. In this process, the quantization module 240 quantizes only the N greatest coefficients that have a magnitude greater than a threshold value to +1 or −1, and sets the remaining coefficients to zero.
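The first example quantization process can be sketched as follows. The function name is illustrative, and tie-breaking among equal-magnitude coefficients is left arbitrary here.

```python
def quantize_top_n(coeffs, n):
    """Keep the signs of the N largest-magnitude coefficients as +1/-1;
    set all remaining coefficients to zero (first example process)."""
    # Indices of the N largest absolute values (ties broken arbitrarily).
    top = sorted(range(len(coeffs)),
                 key=lambda i: abs(coeffs[i]), reverse=True)[:n]
    out = [0] * len(coeffs)
    for i in top:
        if coeffs[i] > 0:
            out[i] = 1
        elif coeffs[i] < 0:
            out[i] = -1
    return out

# Keeping only the two largest coefficients of the example row transform
# result preserves their signs and zeroes everything else.
```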
As part of the quantizing process, the quantization module 240 encodes 410 the three-dimensional coefficient array and flattens the array to a one-dimensional bit vector. If, for example, each coefficient is quantized to +1, −1, or 0, a two-bit encoding scheme uses the bits 10 for +1, 01 for −1, and 00 for zero. Various other encoding techniques are possible without departing from the scope of the invention. The output of the quantization module 240 is a quantized and encoded bit vector that forms a fingerprint for a single video segment.
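The two-bit encoding scheme can be sketched as follows (a string of '0'/'1' characters stands in for the bit vector; a real implementation would pack actual bits):

```python
def encode_two_bit(quantized):
    """Flatten quantized coefficients to a bit string using the example
    two-bit scheme: +1 -> '10', -1 -> '01', 0 -> '00'."""
    codes = {1: "10", -1: "01", 0: "00"}
    return "".join(codes[v] for v in quantized)
```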
The fingerprinting process then repeats 412 for each video segment in the video. The ordered set of video fingerprints generated by the process forms a fingerprint sequence for the entire video file. A fingerprint sequence can be compared to a reference fingerprint sequence by counting the number of differences between the bits in the respective sequences. This comparison provides a good indication of the similarity between the videos associated with the fingerprint sequences.
In order to reduce the number of bit comparisons when comparing fingerprints to each other, the fingerprints are indexed by the indexing module 108. An example process for indexing uses a min-hash process as illustrated in FIG. 8. The min-hash process generates a “signature” for the video fingerprint by applying a set of P permutations to the bit values of the fingerprint. The signature contains fewer bits than the full fingerprint but retains most of the information in the associated fingerprint.
As described above, the video fingerprint is in the form of a bit vector that represents the flattened quantized three-dimensional transform results for an individual segment. Generally, the indexing module applies a number P of permutations to the bits of the fingerprint. Each permutation defines a bit re-arrangement (e.g., bit swap) of the bits of the fingerprint; each permutation may be random or algorithmically defined. The permutations are preferably defined beforehand, but once defined, the permutations are fixed and always applied in the same order.
Referring to FIG. 8, the indexing module 108 receives 802 a fingerprint for a video segment. A new bit vector is generated 806 by re-arranging the bits according to a first permutation P1. A scanning module scans 808 for the location of the first bit value of “1” in the re-arranged bit vector and records 810 this location to a location vector. This process of permutation and location recording repeats 814 for all P permutations. Thus, each received fingerprint will have the same set of P permutations applied in the same order. The output is a location vector having P values, with each value indicating a location of the first bit value of “1” in the underlying fingerprint after applying each permutation. This set of locations provides the signature for the fingerprint.
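The min-hash procedure of FIG. 8 can be sketched as follows. The permutations here are generated randomly for illustration; in practice they would be defined once and then applied in the same fixed order to every fingerprint, as the text above requires.

```python
import random

def minhash_signature(fingerprint_bits, permutations):
    """Min-hash signature: for each permutation, re-arrange the
    fingerprint bits and record the position of the first 1-bit."""
    signature = []
    for perm in permutations:
        rearranged = [fingerprint_bits[i] for i in perm]
        signature.append(rearranged.index(1))  # location of first '1'
    return signature

# P = 4 fixed permutations over an (unrealistically short) 8-bit fingerprint.
random.seed(0)
n = 8
perms = [random.sample(range(n), n) for _ in range(4)]
sig = minhash_signature([0, 1, 0, 0, 1, 0, 0, 0], perms)
```

Because the same permutations are reused, identical fingerprints always produce identical signatures, and similar fingerprints tend to produce signatures that agree in many positions.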
The min-hash process described above can be further combined with locality sensitive hashing. In locality sensitive hashing, each signature is divided into a number of signature blocks and each signature block is placed into a different hash table. For each entry in the hash tables, unique identifiers of any video segment that generates that particular signature block are stored with the corresponding signature block.
FIG. 9 illustrates an example of indexed fingerprints using the min-hash and locality sensitive hashing techniques described above. Two videos, VID 4 comprising 100 segments and VID 7 comprising 365 segments are shown. A first signature 902 a corresponds to a first fingerprint of the second video segment of VID 4. The signature 902 a is represented by a sequence of P locations (e.g., 11, 32, 11, 18 . . . ). The signature is broken into signature blocks 906 of four locations each. According to various embodiments, different sized signature blocks are used. A second signature 902 b corresponds to the third video segment of VID 7. The first signature block in each signature 902 is mapped to table 1, the second signature block is mapped to table 2, and so on. The tables store each signature block and a unique identifier for all video segments that generated each particular signature block. The tables also associate an index number with each unique signature block representing an offset into the table, although the index number itself need not be explicitly stored. For example, table 1, index 1 corresponds to the signature block having the sequence 11, 32, 11, 18. In table 2, the signature block stored at index 1 corresponds to the sequence (563, 398, 13, 6). Because both VID 4 segment 2, and VID 7, segment 3 have this sequence as their second signature block, both segments are mapped to index 1. In practice, each video segment can be assigned a unique identifier, which is used in these tables in place of the tuple (video, segment).
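The table layout of FIG. 9 can be sketched with nested dictionaries; the representation, identifiers, and values below are illustrative (the FIG. 9 signature blocks are reused, with made-up values for the first block of VID 7).

```python
from collections import defaultdict

def index_signature(tables, signature, video_id, segment, block_size=4):
    """Place each signature block into its own hash table, keyed by the
    block's tuple of locations; store a (video, segment) identifier
    alongside it, per the FIG. 9 layout."""
    for t, start in enumerate(range(0, len(signature), block_size)):
        block = tuple(signature[start:start + block_size])
        tables[t][block].append((video_id, segment))

# table number -> {signature block -> list of (video, segment) identifiers}
tables = defaultdict(lambda: defaultdict(list))
index_signature(tables, [11, 32, 11, 18, 563, 398, 13, 6], "VID4", 2)
index_signature(tables, [99, 3, 42, 7, 563, 398, 13, 6], "VID7", 3)
```

Both segments share the second signature block (563, 398, 13, 6), so both identifiers end up under the same entry in table 2 (index 1 here), mirroring FIG. 9.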
Once each video segment's signature is indexed for a collection of videos, the matching module 110 can be used to efficiently compare and match fingerprints of video files. Using the hash techniques described above, videos can be compared simply by comparing the index values of their signature blocks for each segment, rather than performing a bit-for-bit comparison of the entire fingerprint sequence. An example matching process is illustrated in FIG. 10.
In step 1002, a signature sequence (corresponding to the ordered signatures of a fingerprint sequence) for an ingested video is received by the matching module 110. Each signature block of a first signature in the signature sequence is hashed 1004 into the corresponding hash tables. For every matching signature block found in the table, a separate count is incremented for each unique video identifier associated with the matching signature block. Thus, each reference fingerprint maintains a separate count indicating the number of signature blocks of the reference fingerprint that match signature blocks of the first fingerprint of the ingest video. The counts are used to determine 1006 a matching score between the first fingerprint of the ingest video and each reference fingerprint of each video segment in the reference database 112. The matching scores are compared against a threshold value to determine 1008 all reference fingerprints having matching scores above the threshold. Reference fingerprints with matching scores above the threshold are designated as matching fingerprints. This process then repeats for each individual fingerprint of the fingerprint sequence of the ingest video.
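The counting step of FIG. 10 can be sketched as follows; the index layout and segment identifiers are hypothetical. Each hash-table collision increments a per-reference-segment count, and the counts serve as matching scores to be thresholded.

```python
from collections import Counter

def match_scores(tables, signature, block_size=4):
    """Count, per reference segment, how many signature blocks of the
    query signature collide with it in the hash tables."""
    counts = Counter()
    for t, start in enumerate(range(0, len(signature), block_size)):
        block = tuple(signature[start:start + block_size])
        for ref_id in tables.get(t, {}).get(block, ()):
            counts[ref_id] += 1
    return counts

# Hypothetical index: table number -> {signature block -> [segment ids]}
tables = {
    0: {(11, 32, 11, 18): ["VID4:2"]},
    1: {(563, 398, 13, 6): ["VID4:2", "VID7:3"]},
}
scores = match_scores(tables, [11, 32, 11, 18, 563, 398, 13, 6])
```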
Once all matching fingerprints are found for the fingerprints of the ingest video, matching is performed at the sequence level. In one method, the matching module 110 determines the reference video with the longest consecutive sequence of fingerprints that match the fingerprint sequence of the ingest video. Because each fingerprint corresponds to a time segment of video, this method determines a reference video that matches the ingest video over the longest consecutive time period.
Alternative sequence matching methods may also be used. In another example method, a fixed length window of time (e.g., 15 seconds) is designated for the ingest video. The fixed length window of time corresponds to a block of fingerprints in the fingerprint sequence of the ingest video. For a reference video having matching segments, time offsets are determined between each matching segment of the reference video and the corresponding segments of the ingest video. Each matching pair of segments casts a “vote” for a particular time offset. The votes are counted across all matching pairs and the reference window with the highest number of votes is designated as the best match.
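The offset-voting method can be sketched as follows, with segments identified by their position in each video's fingerprint sequence; the numbers below are hypothetical.

```python
from collections import Counter

def best_time_offset(matching_pairs):
    """Each (ingest_segment, reference_segment) match casts a vote for
    the offset between them; the offset with the most votes locates the
    best-matching reference window."""
    votes = Counter(ref - ingest for ingest, ref in matching_pairs)
    offset, count = votes.most_common(1)[0]
    return offset, count

# Three of the four matching pairs agree that the reference video is
# shifted by 40 segments relative to the ingest video; the fourth is an
# outlier and is outvoted.
pairs = [(0, 40), (1, 41), (2, 42), (3, 99)]
```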
The systems and methods described above enable indexing a video library using video fingerprints and matching video content based on spatial and sequential characteristics of the video. This is particularly useful, for example, in finding and removing duplicate video content and preventing sharing of copyright protected content. Moreover, the methods can be performed automatically and are therefore more efficient and cost effective than conventional techniques. The present invention has been described in particular detail with respect to a limited number of embodiments. Those of skill in the art will appreciate that the invention may additionally be practiced in other embodiments. First, the particular naming of the components, capitalization of terms, the attributes, data structures, or any other programming or structural aspect is not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, formats, or protocols. Furthermore, the system may be implemented via a combination of hardware and software, as described, or entirely in hardware elements. Also, the particular division of functionality between the various system components described herein is merely exemplary, and not mandatory; functions performed by a single system component may instead be performed by multiple components, and functions performed by multiple components may instead be performed by a single component.
Some portions of the above description present the features of the present invention in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the art to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or code devices, without loss of generality.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the present discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Certain aspects of the present invention include process steps and instructions described herein in the form of an algorithm. It should be noted that the process steps and instructions of the present invention could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
The present invention also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the present invention is not described with reference to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any references to specific languages are provided for disclosure of enablement and best mode of the present invention.
Finally, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention.

Claims (21)

1. A method for fingerprinting a video, comprising:
receiving the video;
segmenting the video into a plurality of video segments including a first video segment, the first video segment having a plurality of video frames;
performing a three-dimensional transform on the plurality of video frames in the first video segment to generate a three-dimensional wavelet, wherein the three-dimensional wavelet represents spatial information within the video frames and sequential characteristics between the video frames;
quantizing the three-dimensional wavelet to generate a first video fingerprint;
storing a video fingerprint sequence to a non-transitory computer readable storage medium, wherein each video fingerprint in the video fingerprint sequence is associated with a different video segment in the plurality of video segments;
comparing the video fingerprint sequence to a reference fingerprint sequence; and
determining a similarity between the video fingerprint sequence and the reference fingerprint sequence based on the comparison.
2. The method of claim 1, wherein performing the three-dimensional transform on the plurality of frames comprises performing a Haar wavelet transform on each row, column, and time column of the plurality of frames.
3. The method of claim 1, further comprising:
encoding and flattening the three-dimensional wavelet to a one-dimensional structure.
4. The method of claim 1, further comprising normalizing the plurality of frames by converting at least one of frame size, frame rate, and color information to a standard format.
5. The method of claim 1, wherein a second video segment of the plurality of video segments overlaps the first video segment by one or more frames.
6. The method of claim 1, wherein quantizing the three-dimensional wavelet comprises:
selecting a subset of N coefficients of the three-dimensional wavelet, where N is an integer; and
setting negative coefficients of the N coefficients to a first fixed value;
setting positive coefficients of the N coefficients to a second fixed value;
setting remaining coefficients of the three-dimensional wavelet to a third fixed value, wherein the remaining coefficients are not among the N coefficients.
7. The method of claim 6, wherein the first fixed value represents −1, the second fixed value represents +1, and the third fixed value represents zero.
8. The method of claim 6, wherein selecting the N coefficients comprises:
selecting N largest magnitude coefficients of the three-dimensional wavelet.
9. The method of claim 6, wherein selecting the N coefficients comprises:
selecting N coefficients having a magnitude greater than a threshold magnitude.
10. The method of claim 1, wherein storing the video fingerprint comprises:
indexing the video fingerprint to an index storing an association between the video fingerprint and an identifier of the first video segment.
11. A computer system for generating a video fingerprint comprising:
one or more processors; and
a non-transitory computer readable storage medium storing computer-executable program modules executable by the one or more processors, the computer-executable program modules comprising: an input module adapted to receive a video; a segmenting module adapted to segment the video into a plurality of video segments including a first video segment, each segment including at least two frames;
a transform module adapted to perform a three-dimensional transform on the at least two frames in the first video segment to generate a three-dimensional wavelet, wherein the three dimensional wavelet represents spatial characteristics and sequential characteristics of the at least two frames associated with the first video segment;
a quantizing module adapted to quantize the three-dimensional wavelet to generate a first video fingerprint;
an output module adapted to store a video fingerprint sequence to a non-transitory computer readable storage medium, wherein each video fingerprint in the video fingerprint sequence is associated with a different video segment in the plurality of video segments; and
a comparison module adapted to compare the video fingerprint sequence to a reference fingerprint sequence and determine a similarity between the video fingerprint sequence and the reference fingerprint sequence based on the comparison.
12. The computer system of claim 11, further comprising a normalization module adapted to normalize the received video to a predetermined format.
13. The computer system of claim 11, wherein performing the three-dimensional transform on the first video segment comprises performing a Haar wavelet transform on each row, column, and time column of the first video segment.
14. The computer system of claim 11, wherein the quantization module is further configured to encode and flatten the three-dimensional wavelet to a one-dimensional structure.
15. The computer system of claim 11, the computer readable storage medium further storing program instructions for normalizing the first video segment by converting at least one of frame size, frame rate, and color information to a standard format.
16. The computer system of claim 11, wherein a second video segment of the plurality of video segments overlaps the first video segment by one or more frames.
17. A non-transitory computer-readable storage medium storing instructions for fingerprinting a video, the instructions when executed by a processor cause the processor to perform steps including:
receiving the video;
segmenting the video into a plurality of video segments including a first video segment, the first video segment having a plurality of video frames;
performing a three-dimensional transform on the plurality of video frames in the first video segment to generate a three-dimensional wavelet, wherein the three-dimensional wavelet represents spatial information within the video frames and sequential characteristics between the video frames;
quantizing the three-dimensional wavelet to generate a first video fingerprint;
storing a video fingerprint sequence to a non-transitory computer readable storage medium, wherein each video fingerprint in the video fingerprint sequence is associated with a different video segment in the plurality of video segments;
comparing the video fingerprint sequence to a reference fingerprint sequence; and
determining a similarity between the video fingerprint sequence and the reference fingerprint sequence based on the comparison.
18. The non-transitory computer-readable storage medium of claim 17, wherein performing the three-dimensional transform on the plurality of frames comprises performing a Haar wavelet transform on each row, column, and time column of the plurality of frames.
19. The non-transitory computer-readable storage medium of claim 17, further comprising: encoding and flattening the three-dimensional wavelet to a one-dimensional structure.
20. The non-transitory computer-readable storage medium of claim 17, the instructions when executed further causing the processor to normalize the plurality of frames by converting at least one of frame size, frame rate, and color information to a standard format.
21. The non-transitory computer-readable storage medium of claim 17, wherein a second video segment of the plurality of video segments overlaps the first video segment by one or more frames.
US11/746,339 2007-05-09 2007-05-09 Three-dimensional wavelet based video fingerprinting Active 2030-06-01 US8094872B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US11/746,339 US8094872B1 (en) 2007-05-09 2007-05-09 Three-dimensional wavelet based video fingerprinting
US12/968,825 US8611689B1 (en) 2007-05-09 2010-12-15 Three-dimensional wavelet based video fingerprinting
US13/250,494 US8340449B1 (en) 2007-05-09 2011-09-30 Three-dimensional wavelet based video fingerprinting


Publications (1)

Publication Number Publication Date
US8094872B1 true US8094872B1 (en) 2012-01-10

Family

ID=45419176

Family Applications (3)

Application Number Title Priority Date Filing Date
US11/746,339 Active 2030-06-01 US8094872B1 (en) 2007-05-09 2007-05-09 Three-dimensional wavelet based video fingerprinting
US12/968,825 Active US8611689B1 (en) 2007-05-09 2010-12-15 Three-dimensional wavelet based video fingerprinting
US13/250,494 Active US8340449B1 (en) 2007-05-09 2011-09-30 Three-dimensional wavelet based video fingerprinting


WO2015010095A1 (en) * 2013-07-18 2015-01-22 Google Inc. Generating and providing an authorization indication in relation to a media content item
US8953836B1 (en) * 2012-01-31 2015-02-10 Google Inc. Real-time duplicate detection for uploaded videos
US8995771B2 (en) 2012-04-30 2015-03-31 Microsoft Technology Licensing, Llc Identification of duplicates within an image space
US20150143416A1 (en) * 2013-11-21 2015-05-21 Thomson Licensing Method and apparatus for matching of corresponding frames in multimedia streams
US9055335B2 (en) 2009-05-29 2015-06-09 Cognitive Networks, Inc. Systems and methods for addressing a media database using distance associative hashing
US9087260B1 (en) * 2012-01-03 2015-07-21 Google Inc. Hierarchical randomized quantization of multi-dimensional features
US9135674B1 (en) 2007-06-19 2015-09-15 Google Inc. Endpoint based video fingerprinting
US9280977B2 (en) 2009-05-21 2016-03-08 Digimarc Corporation Content recognition employing fingerprinting at different resolutions
US9336367B2 (en) 2006-11-03 2016-05-10 Google Inc. Site directed management of audio components of uploaded video files
US20160316261A1 (en) * 2015-04-23 2016-10-27 Sorenson Media, Inc. Automatic content recognition fingerprint sequence matching
EP3001871A4 (en) * 2013-03-15 2017-02-22 Cognitive Media Networks, Inc. Systems and methods for addressing a media database using distance associative hashing
US20170127095A1 (en) * 2014-06-13 2017-05-04 Samsung Electronics Co., Ltd. Method and device for managing multimedia data
US9661361B2 (en) 2012-09-19 2017-05-23 Google Inc. Systems and methods for live media content matching
US9723344B1 (en) * 2015-12-29 2017-08-01 Google Inc. Early detection of policy violating media
US9781377B2 (en) 2009-12-04 2017-10-03 Tivo Solutions Inc. Recording and playback system based on multimedia content fingerprints
US9813784B1 (en) * 2015-03-25 2017-11-07 A9.com Expanded previously on segments
US9838494B1 (en) * 2014-06-24 2017-12-05 Amazon Technologies, Inc. Reducing retrieval times for compressed objects
US9838753B2 (en) 2013-12-23 2017-12-05 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US20170357875A1 (en) * 2016-06-08 2017-12-14 International Business Machines Corporation Detecting usage of copyrighted video content using object recognition
US20170371963A1 (en) * 2016-06-27 2017-12-28 Facebook, Inc. Systems and methods for identifying matching content
EP3264323A1 (en) * 2016-06-27 2018-01-03 Facebook, Inc. Systems and methods for identifying matching content
US9906834B2 (en) 2009-05-29 2018-02-27 Inscape Data, Inc. Methods for identifying video segments and displaying contextually targeted content on a connected television
US9905233B1 (en) 2014-08-07 2018-02-27 Digimarc Corporation Methods and apparatus for facilitating ambient content recognition using digital watermarks, and related arrangements
US9936230B1 (en) * 2017-05-10 2018-04-03 Google Llc Methods, systems, and media for transforming fingerprints to detect unauthorized media content items
US20180101540A1 (en) * 2016-10-10 2018-04-12 Facebook, Inc. Diversifying Media Search Results on Online Social Networks
US9955192B2 (en) 2013-12-23 2018-04-24 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
WO2018089094A1 (en) * 2016-11-11 2018-05-17 Google Llc Differential scoring: a high-precision scoring method for video matching
US20180176216A1 (en) * 2013-12-31 2018-06-21 Veridium Ip Limited System and method for biometric protocol standards
US10080062B2 (en) 2015-07-16 2018-09-18 Inscape Data, Inc. Optimizing media fingerprint retention to improve system resource utilization
US10116972B2 (en) 2009-05-29 2018-10-30 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
EP3404926A1 (en) * 2017-05-17 2018-11-21 Snell Advanced Media Limited Generation of visual hash
US10169455B2 (en) 2009-05-29 2019-01-01 Inscape Data, Inc. Systems and methods for addressing a media database using distance associative hashing
CN109255777A (en) * 2018-07-27 2019-01-22 昆明理工大学 An image similarity calculation method combining wavelet transform and a perceptual hash algorithm
US10192138B2 (en) 2010-05-27 2019-01-29 Inscape Data, Inc. Systems and methods for reducing data density in large datasets
US10198441B1 (en) 2014-01-14 2019-02-05 Google Llc Real-time duplicate detection of videos in a massive video sharing system
US10236005B2 (en) * 2017-06-08 2019-03-19 The Nielsen Company (Us), Llc Methods and apparatus for audio signature generation and matching
US10324598B2 (en) 2009-12-18 2019-06-18 Graphika, Inc. System and method for a search engine content filter
US10375451B2 (en) 2009-05-29 2019-08-06 Inscape Data, Inc. Detection of common media segments
US10405014B2 (en) 2015-01-30 2019-09-03 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US10482349B2 (en) 2015-04-17 2019-11-19 Inscape Data, Inc. Systems and methods for reducing data density in large datasets
US10555023B1 (en) 2017-09-25 2020-02-04 Amazon Technologies, Inc. Personalized recap clips
US10713495B2 (en) 2018-03-13 2020-07-14 Adobe Inc. Video signatures based on image feature extraction
US10873788B2 (en) 2015-07-16 2020-12-22 Inscape Data, Inc. Detection of common media segments
US10902048B2 (en) 2015-07-16 2021-01-26 Inscape Data, Inc. Prediction of future views of video segments to optimize system resource utilization
US10911824B2 (en) * 2018-11-05 2021-02-02 The Nielsen Company (Us), Llc Methods and apparatus to generate reference signatures
US10949458B2 (en) 2009-05-29 2021-03-16 Inscape Data, Inc. System and method for improving work load management in ACR television monitoring system
US10956945B1 (en) 2014-02-24 2021-03-23 Google Llc Applying social interaction-based policies to digital media content
US10977307B2 (en) 2007-06-18 2021-04-13 Gracenote, Inc. Method and apparatus for multi-dimensional content search and video identification
US10977298B2 (en) 2013-11-08 2021-04-13 Friend for Media Limited Identifying media components
US10983984B2 (en) 2017-04-06 2021-04-20 Inscape Data, Inc. Systems and methods for improving accuracy of device maps using media viewing data
GB2527528B (en) * 2014-06-24 2021-09-29 Grass Valley Ltd Hash-based media search
US20210352341A1 (en) * 2020-05-06 2021-11-11 At&T Intellectual Property I, L.P. Scene cut-based time alignment of video streams
US11308144B2 (en) 2015-07-16 2022-04-19 Inscape Data, Inc. Systems and methods for partitioning search indexes for improved efficiency in identifying media segments
US11329980B2 (en) 2015-08-21 2022-05-10 Veridium Ip Limited System and method for biometric protocol standards
US11409825B2 (en) 2009-12-18 2022-08-09 Graphika Technologies, Inc. Methods and systems for identifying markers of coordinated activity in social media movements

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL2012567B1 (en) 2014-04-04 2016-03-08 Teletrax B V Method and device for generating improved fingerprints.
US9685194B2 (en) 2014-07-23 2017-06-20 Gopro, Inc. Voice-based video tagging
US9886962B2 (en) * 2015-03-02 2018-02-06 Google Llc Extracting audio fingerprints in the compressed domain
US9872056B1 (en) 2016-12-16 2018-01-16 Google Inc. Methods, systems, and media for detecting abusive stereoscopic videos by generating fingerprints for multiple portions of a video frame
CN107317952B (en) * 2017-06-26 2020-12-29 广东德九新能源有限公司 Video image processing method and picture splicing method based on electronic map

Citations (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5241281A (en) * 1990-03-19 1993-08-31 Capetronic Group Ltd. Microprocessor controlled monitor
US5600373A (en) * 1994-01-14 1997-02-04 Houston Advanced Research Center Method and apparatus for video image compression and decompression using boundary-spline-wavelets
US5634012A (en) 1994-11-23 1997-05-27 Xerox Corporation System for controlling the distribution and use of digital works having a fee reporting mechanism
US5664018A (en) 1996-03-12 1997-09-02 Leighton; Frank Thomson Watermarking process resilient to collusion attacks
US5729662A (en) * 1995-06-07 1998-03-17 Rozmus; J. Michael Neural network for classification of patterns with improved method and apparatus for ordering vectors
US6005643A (en) * 1996-10-15 1999-12-21 International Business Machines Corporation Data hiding and extraction methods
US6226387B1 (en) * 1996-08-30 2001-05-01 Regents Of The University Of Minnesota Method and apparatus for scene-based video watermarking
US6407680B1 (en) 2000-12-22 2002-06-18 Generic Media, Inc. Distributed on-demand media transcoding system and method
US20020150164A1 (en) * 2000-06-30 2002-10-17 Boris Felts Encoding method for the compression of a video sequence
US20020165819A1 (en) 2001-05-02 2002-11-07 Gateway, Inc. System and method for providing distributed computing services
US20030061490A1 (en) 2001-09-26 2003-03-27 Abajian Aram Christian Method for identifying copyright infringement violations by fingerprint detection
US20030123584A1 (en) * 2001-12-28 2003-07-03 Siegel Erwin Frederick Trace video filtering using wavelet de-noising techniques
US20040028138A1 (en) * 2000-10-24 2004-02-12 Christopher Piche Three-dimensional wavelet-based scalable video compression
US6768518B1 (en) * 1998-12-16 2004-07-27 Xerox Corporation Method and apparatus for removing a checkerboard-like noise artifact from a captured composite NTSC video frame
US20040170335A1 (en) * 1995-09-14 2004-09-02 Pearlman William Abraham N-dimensional data compression using set partitioning in hierarchical trees
US20050008190A1 (en) * 1995-07-27 2005-01-13 Levy Kenneth L. Digital watermarking systems and methods
US6871200B2 (en) 2002-07-11 2005-03-22 Forensic Eye Ltd. Registration and monitoring system
US20050125845A1 (en) * 2003-12-08 2005-06-09 Hardt Charles R. Set-top software mechanism for insertion of a unique non-intrusive digital signature into video program content
US20050154892A1 (en) * 2004-01-09 2005-07-14 Mihcak Mehmet K. Systems and methods for embedding media forensic identification markings
US20050172312A1 (en) 2003-03-07 2005-08-04 Lienhart Rainer W. Detecting known video entities utilizing fingerprints
US20050213826A1 (en) * 2004-03-25 2005-09-29 Intel Corporation Fingerprinting digital video for rights management in networks
US6976165B1 (en) 1999-09-07 2005-12-13 Emc Corporation System and method for secure storage, transfer and retrieval of content addressable information
US20060085816A1 (en) 2004-10-18 2006-04-20 Funk James M Method and apparatus to control playback in a download-and-view video on demand system
US7039215B2 (en) * 2001-07-18 2006-05-02 Oki Electric Industry Co., Ltd. Watermark information embedment device and watermark information detection device
US7043473B1 (en) * 2000-11-22 2006-05-09 Widevine Technologies, Inc. Media tracking system and method
US20060098872A1 (en) 2004-03-22 2006-05-11 Stereo Display, Inc. Three-dimensional imaging system for pattern recognition
US7046855B2 (en) * 2001-10-29 2006-05-16 Parthusceva Ltd. Method and apparatus for performing spatial-to-frequency domain transform
US20060110005A1 (en) * 2004-11-01 2006-05-25 Sony United Kingdom Limited Encoding apparatus and method
US20060114998A1 (en) * 2002-12-04 2006-06-01 Eric Barrau Video coding method and device
US20060120558A1 (en) * 2004-10-20 2006-06-08 Yun-Qing Shi System and method for lossless data hiding using the integer wavelet transform
US20060187358A1 (en) * 2003-03-07 2006-08-24 Lienhart Rainer W Video entity recognition in compressed digital video streams
US20060195860A1 (en) 2005-02-25 2006-08-31 Eldering Charles A Acting on known video entities detected utilizing fingerprinting
US20060195859A1 (en) 2005-02-25 2006-08-31 Richard Konig Detecting known video entities taking into account regions of disinterest
US20060271947A1 (en) 2005-05-23 2006-11-30 Lienhart Rainer W Creating fingerprints
US20070005556A1 (en) 2005-06-30 2007-01-04 Microsoft Corporation Probabilistic techniques for detecting duplicate tuples
US7185200B1 (en) * 1999-09-02 2007-02-27 Microsoft Corporation Server-side watermark data writing method and apparatus for digital signals
US20070047816A1 (en) 2005-08-23 2007-03-01 Jamey Graham User Interface for Mixed Media Reality
US20070124756A1 (en) 2005-11-29 2007-05-31 Google Inc. Detecting Repeating Content in Broadcast Media
US20070124698A1 (en) 2005-11-15 2007-05-31 Microsoft Corporation Fast collaborative filtering through approximations
US20070156726A1 (en) 2005-12-21 2007-07-05 Levy Kenneth L Content Metadata Directory Services
US20070180537A1 (en) 2005-01-07 2007-08-02 Shan He Method for fingerprinting multimedia content
US20070230739A1 (en) * 1997-02-20 2007-10-04 Andrew Johnson Digital watermark systems and methods
US20070288518A1 (en) 2006-05-03 2007-12-13 Jeff Crigler System and method for collecting and distributing content
US20080059211A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and compliance
US20080059426A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and compliance enforcement
US20080059536A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and host compliance evaluation
US20080059425A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Compliance information retrieval
US20080059461A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content search using a provided interface
US7343025B2 (en) * 2003-10-02 2008-03-11 Electronics And Telecommunications Research Institute Method for embedding and extracting digital watermark on lowest wavelet subband
US7366787B2 (en) 2001-06-08 2008-04-29 Sun Microsystems, Inc. Dynamic configuration of a content publisher
US7370017B1 (en) 2002-12-20 2008-05-06 Microsoft Corporation Redistribution of rights-managed content and technique for encouraging same
US20080163288A1 (en) * 2007-01-03 2008-07-03 At&T Knowledge Ventures, Lp System and method of managing protected video content
US20080178302A1 (en) * 2007-01-19 2008-07-24 Attributor Corporation Determination of originality of content
US20080178288A1 (en) 2007-01-24 2008-07-24 Secure Computing Corporation Detecting Image Spam
US7415127B2 (en) * 2002-10-31 2008-08-19 France Telecom System and method of watermarking a video signal and extracting the watermarking from a video signal
US20090013414A1 (en) * 2007-07-02 2009-01-08 Ripcode, Inc. System and Method for Monitoring Content
US20090125310A1 (en) * 2006-06-21 2009-05-14 Seungjae Lee Apparatus and method for inserting/extracting capturing resistant audio watermark based on discrete wavelet transform, audio rights protection system using the same
US20090165031A1 (en) * 2007-12-19 2009-06-25 At&T Knowledge Ventures, L.P. Systems and Methods to Identify Target Video Content
US20090324199A1 (en) 2006-06-20 2009-12-31 Koninklijke Philips Electronics N.V. Generating fingerprints of video signals
US20090327334A1 (en) * 2008-06-30 2009-12-31 Rodriguez Arturo A Generating Measures of Video Sequences to Detect Unauthorized Use
US20090328125A1 (en) * 2008-06-30 2009-12-31 Gits Peter M Video fingerprint systems and methods
US20090328237A1 (en) * 2008-06-30 2009-12-31 Rodriguez Arturo A Matching of Unknown Video Content To Protected Video Content
US7653552B2 (en) 2001-03-21 2010-01-26 Qurio Holdings, Inc. Digital file marketplace
US7702127B2 (en) * 2005-10-21 2010-04-20 Microsoft Corporation Video fingerprinting using complexity-regularized video watermarking by statistics quantization
US20100119105A1 (en) * 2008-10-28 2010-05-13 Koichi Moriya Image processing device and image processing program
US20100182401A1 (en) * 2007-06-18 2010-07-22 Young-Suk Yoon System and method for managing digital videos using video features
US7817861B2 (en) 2006-11-03 2010-10-19 Symantec Corporation Detection of image spam
US7882177B2 (en) 2007-08-06 2011-02-01 Yahoo! Inc. Employing pixel density to detect a spam image
US7903868B2 (en) * 2006-07-24 2011-03-08 Samsung Electronics Co. Ltd. Video fingerprinting apparatus in frequency domain and method using the same
US8019742B1 (en) 2007-05-31 2011-09-13 Google Inc. Identifying related queries

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07299053A (en) * 1994-04-29 1995-11-14 Arch Dev Corp Computer diagnosis support method
US5987094A (en) * 1996-10-30 1999-11-16 University Of South Florida Computer-assisted method and apparatus for the detection of lung nodules
US6778709B1 (en) * 1999-03-12 2004-08-17 Hewlett-Packard Development Company, L.P. Embedded block coding with optimized truncation
KR20020026175A (en) * 2000-04-04 2002-04-06 요트.게.아. 롤페즈 Video encoding method using a wavelet transform
US6760536B1 (en) 2000-05-16 2004-07-06 International Business Machines Corporation Fast video playback with automatic content based variable speed
EP1297709A1 (en) * 2000-06-14 2003-04-02 Koninklijke Philips Electronics N.V. Color video encoding and decoding method
US7277468B2 (en) * 2000-09-11 2007-10-02 Digimarc Corporation Measuring quality of service of broadcast multimedia signals using digital watermark analyses
KR100426305B1 (en) * 2001-11-27 2004-04-08 한국전자통신연구원 Apparatus and method for embedding and extracting digital water mark using blind mode based on wavelet
JP2003242281A (en) 2002-02-19 2003-08-29 Sony Corp Use right control system, use right control device, method for controlling use right, programs therefor, and program recording media
US7366909B2 (en) * 2002-04-29 2008-04-29 The Boeing Company Dynamic wavelet feature-based watermark
US7249060B2 (en) 2002-08-12 2007-07-24 Paybyclick Corporation Systems and methods for distributing on-line content
JP3982686B2 (en) * 2002-11-21 2007-09-26 株式会社リコー Code generation apparatus, code generation program, and storage medium
JP4111268B2 (en) * 2002-12-13 2008-07-02 株式会社リコー Thumbnail image display method, server computer, client computer, and program
US20050195975A1 (en) * 2003-01-21 2005-09-08 Kevin Kawakita Digital media distribution cryptography using media ticket smart cards
US20060167881A1 (en) 2003-02-25 2006-07-27 Ali Aydar Digital media file identification
US7397936B2 (en) * 2003-08-13 2008-07-08 Siemens Medical Solutions Usa, Inc. Method and system for wavelet based detection of colon polyps
FR2870376B1 (en) * 2004-05-11 2006-09-22 Yann Boutant METHOD FOR RECOGNIZING FIBROUS MEDIA, AND APPLICATIONS OF SUCH A METHOD IN THE COMPUTER FIELD, IN PARTICULAR
KR100697516B1 (en) * 2004-10-27 2007-03-20 엘지전자 주식회사 Moving picture coding method based on 3D wavelet transformation
US20110258049A1 (en) 2005-09-14 2011-10-20 Jorey Ramer Integrated Advertising System
US20070106551A1 (en) 2005-09-20 2007-05-10 Mcgucken Elliot 22nets: method, system, and apparatus for building content and talent marketplaces and archives based on a social network
US7742619B2 (en) * 2005-12-21 2010-06-22 Texas Instruments Incorporated Image watermarking based on sequency and wavelet transforms
JP4398943B2 (en) * 2006-01-20 2010-01-13 株式会社東芝 Digital watermark detection apparatus, digital watermark detection method, and digital watermark detection program
GB2450023B (en) * 2006-03-03 2011-06-08 Honeywell Int Inc An iris image encoding method
US8037506B2 (en) 2006-03-03 2011-10-11 Verimatrix, Inc. Movie studio-based network distribution system and method
US8085995B2 (en) * 2006-12-01 2011-12-27 Google Inc. Identifying images using face recognition
US8050454B2 (en) 2006-12-29 2011-11-01 Intel Corporation Processing digital video using trajectory extraction and spatiotemporal decomposition
US8094872B1 (en) * 2007-05-09 2012-01-10 Google Inc. Three-dimensional wavelet based video fingerprinting
US20120002008A1 (en) * 2010-07-04 2012-01-05 David Valin Apparatus for secure recording and transformation of images to light for identification, and audio visual projection to spatial point targeted area
US20110153362A1 (en) * 2009-12-17 2011-06-23 Valin David A Method and mechanism for identifying protecting, requesting, assisting and managing information

Patent Citations (77)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5241281A (en) * 1990-03-19 1993-08-31 Capetronic Group Ltd. Microprocessor controlled monitor
US5600373A (en) * 1994-01-14 1997-02-04 Houston Advanced Research Center Method and apparatus for video image compression and decompression using boundary-spline-wavelets
US5634012A (en) 1994-11-23 1997-05-27 Xerox Corporation System for controlling the distribution and use of digital works having a fee reporting mechanism
US5729662A (en) * 1995-06-07 1998-03-17 Rozmus; J. Michael Neural network for classification of patterns with improved method and apparatus for ordering vectors
US20050008190A1 (en) * 1995-07-27 2005-01-13 Levy Kenneth L. Digital watermarking systems and methods
US20040170335A1 (en) * 1995-09-14 2004-09-02 Pearlman William Abraham N-dimensional data compression using set partitioning in hierarchical trees
US5664018A (en) 1996-03-12 1997-09-02 Leighton; Frank Thomson Watermarking process resilient to collusion attacks
US6226387B1 (en) * 1996-08-30 2001-05-01 Regents Of The University Of Minnesota Method and apparatus for scene-based video watermarking
US6005643A (en) * 1996-10-15 1999-12-21 International Business Machines Corporation Data hiding and extraction methods
US20080065896A1 (en) * 1997-02-20 2008-03-13 Andrew Johnson Digital Watermark Systems and Methods
US20070230739A1 (en) * 1997-02-20 2007-10-04 Andrew Johnson Digital watermark systems and methods
US20080130944A1 (en) * 1997-02-20 2008-06-05 Andrew Johnson Digital Watermark Systems and Methods
US6768518B1 (en) * 1998-12-16 2004-07-27 Xerox Corporation Method and apparatus for removing a checkerboard-like noise artifact from a captured composite NTSC video frame
US7185200B1 (en) * 1999-09-02 2007-02-27 Microsoft Corporation Server-side watermark data writing method and apparatus for digital signals
US6976165B1 (en) 1999-09-07 2005-12-13 Emc Corporation System and method for secure storage, transfer and retrieval of content addressable information
US6907075B2 (en) * 2000-06-30 2005-06-14 Koninklijke Philips Electronics N.V. Encoding method for the compression of a video sequence
US20020150164A1 (en) * 2000-06-30 2002-10-17 Boris Felts Encoding method for the compression of a video sequence
US20040028138A1 (en) * 2000-10-24 2004-02-12 Christopher Piche Three-dimensional wavelet-based scalable video compression
US7043473B1 (en) * 2000-11-22 2006-05-09 Widevine Technologies, Inc. Media tracking system and method
US6407680B1 (en) 2000-12-22 2002-06-18 Generic Media, Inc. Distributed on-demand media transcoding system and method
US7653552B2 (en) 2001-03-21 2010-01-26 Qurio Holdings, Inc. Digital file marketplace
US20020165819A1 (en) 2001-05-02 2002-11-07 Gateway, Inc. System and method for providing distributed computing services
US7366787B2 (en) 2001-06-08 2008-04-29 Sun Microsystems, Inc. Dynamic configuration of a content publisher
US7039215B2 (en) * 2001-07-18 2006-05-02 Oki Electric Industry Co., Ltd. Watermark information embedment device and watermark information detection device
US20030061490A1 (en) 2001-09-26 2003-03-27 Abajian Aram Christian Method for identifying copyright infringement violations by fingerprint detection
US7046855B2 (en) * 2001-10-29 2006-05-16 Parthusceva Ltd. Method and apparatus for performing spatial-to-frequency domain transform
US20030123584A1 (en) * 2001-12-28 2003-07-03 Siegel Erwin Frederick Trace video filtering using wavelet de-noising techniques
US6871200B2 (en) 2002-07-11 2005-03-22 Forensic Eye Ltd. Registration and monitoring system
US7415127B2 (en) * 2002-10-31 2008-08-19 France Telecom System and method of watermarking a video signal and extracting the watermarking from a video signal
US20060114998A1 (en) * 2002-12-04 2006-06-01 Eric Barrau Video coding method and device
US7370017B1 (en) 2002-12-20 2008-05-06 Microsoft Corporation Redistribution of rights-managed content and technique for encouraging same
US7738704B2 (en) 2003-03-07 2010-06-15 Technology, Patents And Licensing, Inc. Detecting known video entities utilizing fingerprints
US20050172312A1 (en) 2003-03-07 2005-08-04 Lienhart Rainer W. Detecting known video entities utilizing fingerprints
US20060187358A1 (en) * 2003-03-07 2006-08-24 Lienhart Rainer W Video entity recognition in compressed digital video streams
US7343025B2 (en) * 2003-10-02 2008-03-11 Electronics And Telecommunications Research Institute Method for embedding and extracting digital watermark on lowest wavelet subband
US20050125845A1 (en) * 2003-12-08 2005-06-09 Hardt Charles R. Set-top software mechanism for insertion of a unique non-intrusive digital signature into video program content
US20050154892A1 (en) * 2004-01-09 2005-07-14 Mihcak Mehmet K. Systems and methods for embedding media forensic identification markings
US20060098872A1 (en) 2004-03-22 2006-05-11 Stereo Display, Inc. Three-dimensional imaging system for pattern recognition
US7212330B2 (en) 2004-03-22 2007-05-01 Angstrom, Inc. Three-dimensional imaging system for pattern recognition
US20050213826A1 (en) * 2004-03-25 2005-09-29 Intel Corporation Fingerprinting digital video for rights management in networks
US20060085816A1 (en) 2004-10-18 2006-04-20 Funk James M Method and apparatus to control playback in a download-and-view video on demand system
US20060120558A1 (en) * 2004-10-20 2006-06-08 Yun-Qing Shi System and method for lossless data hiding using the integer wavelet transform
US20060110005A1 (en) * 2004-11-01 2006-05-25 Sony United Kingdom Limited Encoding apparatus and method
US20070180537A1 (en) 2005-01-07 2007-08-02 Shan He Method for fingerprinting multimedia content
US20060195860A1 (en) 2005-02-25 2006-08-31 Eldering Charles A Acting on known video entities detected utilizing fingerprinting
US20060195859A1 (en) 2005-02-25 2006-08-31 Richard Konig Detecting known video entities taking into account regions of disinterest
US20060271947A1 (en) 2005-05-23 2006-11-30 Lienhart Rainer W Creating fingerprints
US20070005556A1 (en) 2005-06-30 2007-01-04 Microsoft Corporation Probabilistic techniques for detecting duplicate tuples
US20070047816A1 (en) 2005-08-23 2007-03-01 Jamey Graham User Interface for Mixed Media Reality
US7702127B2 (en) * 2005-10-21 2010-04-20 Microsoft Corporation Video fingerprinting using complexity-regularized video watermarking by statistics quantization
US20070124698A1 (en) 2005-11-15 2007-05-31 Microsoft Corporation Fast collaborative filtering through approximations
US20070143778A1 (en) 2005-11-29 2007-06-21 Google Inc. Determining Popularity Ratings Using Social and Interactive Applications for Mass Media
US20070124756A1 (en) 2005-11-29 2007-05-31 Google Inc. Detecting Repeating Content in Broadcast Media
US20070130580A1 (en) 2005-11-29 2007-06-07 Google Inc. Social and Interactive Applications for Mass Media
US20070156726A1 (en) 2005-12-21 2007-07-05 Levy Kenneth L Content Metadata Directory Services
US20070288518A1 (en) 2006-05-03 2007-12-13 Jeff Crigler System and method for collecting and distributing content
US20090324199A1 (en) 2006-06-20 2009-12-31 Koninklijke Philips Electronics N.V. Generating fingerprints of video signals
US20090125310A1 (en) * 2006-06-21 2009-05-14 Seungjae Lee Apparatus and method for inserting/extracting capturing resistant audio watermark based on discrete wavelet transform, audio rights protection system using the same
US7903868B2 (en) * 2006-07-24 2011-03-08 Samsung Electronics Co. Ltd. Video fingerprinting apparatus in frequency domain and method using the same
US20080059211A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and compliance
US20080059536A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and host compliance evaluation
US20080059426A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content monitoring and compliance enforcement
US20080059461A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Content search using a provided interface
US20080059425A1 (en) * 2006-08-29 2008-03-06 Attributor Corporation Compliance information retrieval
US7817861B2 (en) 2006-11-03 2010-10-19 Symantec Corporation Detection of image spam
US20080163288A1 (en) * 2007-01-03 2008-07-03 At&T Knowledge Ventures, Lp System and method of managing protected video content
US20080178302A1 (en) * 2007-01-19 2008-07-24 Attributor Corporation Determination of originality of content
US20080178288A1 (en) 2007-01-24 2008-07-24 Secure Computing Corporation Detecting Image Spam
US8019742B1 (en) 2007-05-31 2011-09-13 Google Inc. Identifying related queries
US20100182401A1 (en) * 2007-06-18 2010-07-22 Young-Suk Yoon System and method for managing digital videos using video features
US20090013414A1 (en) * 2007-07-02 2009-01-08 Ripcode, Inc. System and Method for Monitoring Content
US7882177B2 (en) 2007-08-06 2011-02-01 Yahoo! Inc. Employing pixel density to detect a spam image
US20090165031A1 (en) * 2007-12-19 2009-06-25 At&T Knowledge Ventures, L.P. Systems and Methods to Identify Target Video Content
US20090328237A1 (en) * 2008-06-30 2009-12-31 Rodriguez Arturo A Matching of Unknown Video Content To Protected Video Content
US20090328125A1 (en) * 2008-06-30 2009-12-31 Gits Peter M Video fingerprint systems and methods
US20090327334A1 (en) * 2008-06-30 2009-12-31 Rodriguez Arturo A Generating Measures of Video Sequences to Detect Unauthorized Use
US20100119105A1 (en) * 2008-10-28 2010-05-13 Koichi Moriya Image processing device and image processing program

Non-Patent Citations (16)

* Cited by examiner, † Cited by third party
Title
Ashwin Swaminathan et al., Robust and Secure Image Hashing, IEEE Transactions on Information Forensics and Security, Jun. 2006, pp. 215-230, vol. 1, No. 2.
Charles E. Jacobs et al., Fast Multiresolution Image Querying, International Conference on Computer Graphics and Interactive Techniques, Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, 1995, pp. 277-286, ACM, U.S.A.
Definition of "Publisher", The Penguin English Dictionary, 2007, Credo Reference, 1 page, [online] [retrieved on Jul. 31, 2010] Retrieved from the internet: <URL:http://www.xreferplus.com/entry/penguineng/publisher>.
Edith Cohen et al., Finding Interesting Associations without Support Pruning, IEEE Transactions on Knowledge and Data Engineering, 2001, pp. 64-78, vol. 13, Issue 1.
Gargi et al. "Solving the Label Resolution Problem with Supervised Video" MIR (2008) ACM pp. 1-7. *
Michele Covell et al., Known-Audio Detection Using Waveprint: Spectrogram Fingerprinting by Wavelet Hashing, International Conference on Acoustics, Speech and Signal Processing (ICASSP-2007), 2007.
Ondrej Chum et al., Scalable Near Identical Image and Shot Detection, Conference on Image and Audio Video Retrieval, Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007, pp. 549-556, ACM, N.Y., USA.
Paris et al. "Low Bit Rate Software Only Wavelet Video Coding" IEEE (1997) pp. 1-6. *
Pierre Moulin et al., Performance of Random Fingerprinting Codes Under Arbitrary Nonlinear Attacks, IEEE International Conference on Acoustics Speech and Signal Processing, Apr. 2007, pp. II-157-II-160, vol. 2, Issue 15-20.
Sarkar, A., et al. "Video Fingerprinting: Features for Duplicate and Similar Video Detection and Query-based Video Retrieval" CiteSeerX, Jan. 2008, vol. 6820, pp. 1-12.
Shumeet Baluja et al., Audio Fingerprinting: Combining Computer Vision & Data Stream Processing, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007), Apr. 15-20, 2007, pp. II-213-II-216, vol. 2.
Shumeet Baluja et al., Content Fingerprinting Using Wavelets, 3rd European Conference on Visual Media Production, 2006, pp. 198-207.
Ting Liu et al., Clustering Billions of Images with Large Scale Nearest Neighbor Search, 8th IEEE Workshop on Application of Computer Vision (WACV'07), Feb. 2007, pp. 28-34, U.S.A.
Toderici et al. "Automatic, Efficient, Temporally Coherent Video Enhancement for Large Scale Applications" MM Oct. 19-24, 2009 pp. 1-4. *
Yagnik et al. "A Model Based Factorization Approach for dense 3D recovery from Monocular Video" Proc. of the 7th IEEE ISM (2005) pp. 1-4. *
Yagnik et al. "Learning People Annotation from the Web via Consistency Learning" MIR (2007) ACM pp. 1-6. *

Cited By (157)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9336367B2 (en) 2006-11-03 2016-05-10 Google Inc. Site directed management of audio components of uploaded video files
US20080109369A1 (en) * 2006-11-03 2008-05-08 Yi-Ling Su Content Management System
US10643249B2 (en) 2007-05-03 2020-05-05 Google Llc Categorizing digital content providers
US20080275763A1 (en) * 2007-05-03 2008-11-06 Thai Tran Monetization of Digital Content Contributions
US8924270B2 (en) 2007-05-03 2014-12-30 Google Inc. Monetization of digital content contributions
US8611689B1 (en) * 2007-05-09 2013-12-17 Google Inc. Three-dimensional wavelet based video fingerprinting
US8340449B1 (en) * 2007-05-09 2012-12-25 Google Inc. Three-dimensional wavelet based video fingerprinting
US11803591B2 (en) 2007-06-18 2023-10-31 Roku, Inc. Method and apparatus for multi-dimensional content search and video identification
US11288312B2 (en) 2007-06-18 2022-03-29 Roku, Inc. Method and apparatus for multi-dimensional content search and video identification
US10977307B2 (en) 2007-06-18 2021-04-13 Gracenote, Inc. Method and apparatus for multi-dimensional content search and video identification
US11126654B1 (en) * 2007-06-18 2021-09-21 Roku, Inc. Method and apparatus for multi-dimensional content search and video identification
US9135674B1 (en) 2007-06-19 2015-09-15 Google Inc. Endpoint based video fingerprinting
US20090013189A1 (en) * 2007-06-28 2009-01-08 Michel Morvan Method and devices for video processing rights enforcement
US8453248B2 (en) * 2007-06-28 2013-05-28 Thomson Licensing Method and devices for video processing rights enforcement
US8447032B1 (en) * 2007-08-22 2013-05-21 Google Inc. Generation of min-hash signatures
US8712216B1 (en) * 2008-02-22 2014-04-29 Google Inc. Selection of hash lookup keys for efficient retrieval
US20090238465A1 (en) * 2008-03-18 2009-09-24 Electronics And Telecommunications Research Institute Apparatus and method for extracting features of video, and system and method for identifying videos using same
US8620107B2 (en) * 2008-03-18 2013-12-31 Electronics And Telecommunications Research Institute Apparatus and method for extracting features of video, and system and method for identifying videos using same
US20090276468A1 (en) * 2008-05-01 2009-11-05 Yahoo! Inc. Method for media fingerprinting
US8752185B2 (en) * 2008-05-01 2014-06-10 Yahoo! Inc. Method for media fingerprinting
US20090313249A1 (en) * 2008-06-11 2009-12-17 Bennett James D Creative work registry independent server
US8498487B2 (en) * 2008-08-20 2013-07-30 Sri International Content-based matching of videos using local spatio-temporal fingerprints
US20100049711A1 (en) * 2008-08-20 2010-02-25 Gajinder Singh Content-based matching of videos using local spatio-temporal fingerprints
US20100174930A1 (en) * 2009-01-05 2010-07-08 Samsung Electronics Co., Ltd. Mobile device having organic light emitting display and related display method for power saving
US10133956B2 (en) * 2009-03-13 2018-11-20 Nec Corporation Image signature extraction device
US20120027306A1 (en) * 2009-03-13 2012-02-02 Nec Corporation Image signature extraction device
US9280977B2 (en) 2009-05-21 2016-03-08 Digimarc Corporation Content recognition employing fingerprinting at different resolutions
US11272248B2 (en) 2009-05-29 2022-03-08 Inscape Data, Inc. Methods for identifying video segments and displaying contextually targeted content on a connected television
US10949458B2 (en) 2009-05-29 2021-03-16 Inscape Data, Inc. System and method for improving work load management in ACR television monitoring system
US10185768B2 (en) 2009-05-29 2019-01-22 Inscape Data, Inc. Systems and methods for addressing a media database using distance associative hashing
US10271098B2 (en) 2009-05-29 2019-04-23 Inscape Data, Inc. Methods for identifying video segments and displaying contextually targeted content on a connected television
US10169455B2 (en) 2009-05-29 2019-01-01 Inscape Data, Inc. Systems and methods for addressing a media database using distance associative hashing
US11080331B2 (en) 2009-05-29 2021-08-03 Inscape Data, Inc. Systems and methods for addressing a media database using distance associative hashing
US9906834B2 (en) 2009-05-29 2018-02-27 Inscape Data, Inc. Methods for identifying video segments and displaying contextually targeted content on a connected television
US10820048B2 (en) 2009-05-29 2020-10-27 Inscape Data, Inc. Methods for identifying video segments and displaying contextually targeted content on a connected television
US9055335B2 (en) 2009-05-29 2015-06-09 Cognitive Networks, Inc. Systems and methods for addressing a media database using distance associative hashing
US10375451B2 (en) 2009-05-29 2019-08-06 Inscape Data, Inc. Detection of common media segments
US10116972B2 (en) 2009-05-29 2018-10-30 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US20110004944A1 (en) * 2009-06-24 2011-01-06 Tvu Networks Corporation Methods and systems for fingerprint-based copyright protection of real-time content
US8464357B2 (en) * 2009-06-24 2013-06-11 Tvu Networks Corporation Methods and systems for fingerprint-based copyright protection of real-time content
US11653053B2 (en) 2009-09-14 2023-05-16 Tivo Solutions Inc. Multifunction multimedia device
US9369758B2 (en) 2009-09-14 2016-06-14 Tivo Inc. Multifunction multimedia device
US9648380B2 (en) 2009-09-14 2017-05-09 Tivo Solutions Inc. Multimedia device recording notification system
US10805670B2 (en) 2009-09-14 2020-10-13 Tivo Solutions, Inc. Multifunction multimedia device
US10097880B2 (en) 2009-09-14 2018-10-09 Tivo Solutions Inc. Multifunction multimedia device
US20130332951A1 (en) * 2009-09-14 2013-12-12 Tivo Inc. Multifunction multimedia device
US9521453B2 (en) 2009-09-14 2016-12-13 Tivo Inc. Multifunction multimedia device
US9554176B2 (en) * 2009-09-14 2017-01-24 Tivo Inc. Media content fingerprinting system
US9781377B2 (en) 2009-12-04 2017-10-03 Tivo Solutions Inc. Recording and playback system based on multimedia content fingerprints
US11409825B2 (en) 2009-12-18 2022-08-09 Graphika Technologies, Inc. Methods and systems for identifying markers of coordinated activity in social media movements
US10324598B2 (en) 2009-12-18 2019-06-18 Graphika, Inc. System and method for a search engine content filter
US20110235908A1 (en) * 2010-03-23 2011-09-29 Microsoft Corporation Partition min-hash for partial-duplicate image determination
US8452106B2 (en) * 2010-03-23 2013-05-28 Microsoft Corporation Partition min-hash for partial-duplicate image determination
US10192138B2 (en) 2010-05-27 2019-01-29 Inscape Data, Inc. Systems and methods for reducing data density in large datasets
US20110311095A1 (en) * 2010-06-18 2011-12-22 Verizon Patent And Licensing, Inc. Content fingerprinting
US9047516B2 (en) * 2010-06-18 2015-06-02 Verizon Patent And Licensing Inc. Content fingerprinting
US20120062793A1 (en) * 2010-09-15 2012-03-15 Verizon Patent And Licensing Inc. Synchronizing videos
US8928809B2 (en) * 2010-09-15 2015-01-06 Verizon Patent And Licensing Inc. Synchronizing videos
US9552442B2 (en) * 2010-10-21 2017-01-24 International Business Machines Corporation Visual meme tracking for social media analysis
US10303801B2 (en) * 2010-10-21 2019-05-28 International Business Machines Corporation Visual meme tracking for social media analysis
US20120102021A1 (en) * 2010-10-21 2012-04-26 International Business Machines Corporation Visual meme tracking for social media analysis
US20170109360A1 (en) * 2010-10-21 2017-04-20 International Business Machines Corporation Visual meme tracking for social media analysis
US8462984B2 (en) * 2011-03-03 2013-06-11 Cypher, Llc Data pattern recognition and separation engine
US20120224741A1 (en) * 2011-03-03 2012-09-06 Edwards Tyson Lavar Data pattern recognition and separation engine
US9087260B1 (en) * 2012-01-03 2015-07-21 Google Inc. Hierarchical randomized quantization of multi-dimensional features
US8660296B1 (en) * 2012-01-10 2014-02-25 Google Inc. Systems and methods for facilitating video fingerprinting using local descriptors
US9177350B1 (en) 2012-01-10 2015-11-03 Google Inc. Systems and methods for facilitating video fingerprinting using local descriptors
US8953836B1 (en) * 2012-01-31 2015-02-10 Google Inc. Real-time duplicate detection for uploaded videos
US8842920B1 (en) 2012-03-29 2014-09-23 Google Inc. Systems and methods for facilitating crop-invariant video fingerprinting using wavelet transforms
US8995771B2 (en) 2012-04-30 2015-03-31 Microsoft Technology Licensing, Llc Identification of duplicates within an image space
US9113203B2 (en) * 2012-06-28 2015-08-18 Google Inc. Generating a sequence of audio fingerprints at a set top box
US20140002749A1 (en) * 2012-06-28 2014-01-02 Mihai Pora Generating a Sequence of Audio Fingerprints at a Set Top Box
US8843952B2 (en) 2012-06-28 2014-09-23 Google Inc. Determining TV program information based on analysis of audio fingerprints
US11677995B2 (en) 2012-09-19 2023-06-13 Google Llc Systems and methods for live media content matching
US9661361B2 (en) 2012-09-19 2017-05-23 Google Inc. Systems and methods for live media content matching
US11064227B2 (en) 2012-09-19 2021-07-13 Google Llc Systems and methods for live media content matching
US10536733B2 (en) 2012-09-19 2020-01-14 Google Llc Systems and methods for live media content matching
US9146990B2 (en) * 2013-01-07 2015-09-29 Gracenote, Inc. Search and identification of video content
US9959345B2 (en) * 2013-01-07 2018-05-01 Gracenote, Inc. Search and identification of video content
US20140193027A1 (en) * 2013-01-07 2014-07-10 Steven D. Scherf Search and identification of video content
US20150356178A1 (en) * 2013-01-07 2015-12-10 Gracenote, Inc. Search and identification of video content
US9613101B2 (en) 2013-03-15 2017-04-04 Google Inc. Promoting an original version of a copyrighted media item over an authorized copied version of the copyrighted media item in a search query
WO2014152567A1 (en) * 2013-03-15 2014-09-25 Google Inc. Promoting an original version of a copyrighted media item over an authorized copied version of the copyrighted media item in a search query
WO2014145929A1 (en) * 2013-03-15 2014-09-18 Zeev Neumeier Systems and methods for addressing a media database using distance associative hashing
EP3001871A4 (en) * 2013-03-15 2017-02-22 Cognitive Media Networks, Inc. Systems and methods for addressing a media database using distance associative hashing
CN105684384A (en) * 2013-07-18 2016-06-15 谷歌公司 Generating and providing an authorization indication in relation to a media content item
US20150026078A1 (en) * 2013-07-18 2015-01-22 Google Inc. Generating and providing an authorization indication in relation to a media content item
WO2015010095A1 (en) * 2013-07-18 2015-01-22 Google Inc. Generating and providing an authorization indication in relation to a media content item
CN105684384B (en) * 2013-07-18 2019-08-30 Google LLC Generating and providing an authorization indication in relation to a media content item
KR101802100B1 (en) 2013-07-18 2017-11-27 구글 엘엘씨 Generating and providing an authorization indication in relation to a media content item
US10977298B2 (en) 2013-11-08 2021-04-13 Friend for Media Limited Identifying media components
US11500916B2 (en) 2013-11-08 2022-11-15 Friend for Media Limited Identifying media components
US20150143416A1 (en) * 2013-11-21 2015-05-21 Thomson Licensing Method and apparatus for matching of corresponding frames in multimedia streams
US9584844B2 (en) * 2013-11-21 2017-02-28 Thomson Licensing Sas Method and apparatus for matching of corresponding frames in multimedia streams
US9955192B2 (en) 2013-12-23 2018-04-24 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US9838753B2 (en) 2013-12-23 2017-12-05 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US11039178B2 (en) 2013-12-23 2021-06-15 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US10306274B2 (en) 2013-12-23 2019-05-28 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US10284884B2 (en) 2013-12-23 2019-05-07 Inscape Data, Inc. Monitoring individual viewing of television events using tracking pixels and cookies
US10536454B2 (en) * 2013-12-31 2020-01-14 Veridium Ip Limited System and method for biometric protocol standards
US20180176216A1 (en) * 2013-12-31 2018-06-21 Veridium Ip Limited System and method for biometric protocol standards
US10198441B1 (en) 2014-01-14 2019-02-05 Google Llc Real-time duplicate detection of videos in a massive video sharing system
US10956945B1 (en) 2014-02-24 2021-03-23 Google Llc Applying social interaction-based policies to digital media content
US20170127095A1 (en) * 2014-06-13 2017-05-04 Samsung Electronics Co., Ltd. Method and device for managing multimedia data
US10645425B2 (en) * 2014-06-13 2020-05-05 Samsung Electronics Co., Ltd. Method and device for managing multimedia data
GB2527528B (en) * 2014-06-24 2021-09-29 Grass Valley Ltd Hash-based media search
US9838494B1 (en) * 2014-06-24 2017-12-05 Amazon Technologies, Inc. Reducing retrieval times for compressed objects
US9905233B1 (en) 2014-08-07 2018-02-27 Digimarc Corporation Methods and apparatus for facilitating ambient content recognition using digital watermarks, and related arrangements
US10405014B2 (en) 2015-01-30 2019-09-03 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US10945006B2 (en) 2015-01-30 2021-03-09 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US11711554B2 (en) 2015-01-30 2023-07-25 Inscape Data, Inc. Methods for identifying video segments and displaying option to view from an alternative source and/or on an alternative device
US9813784B1 (en) * 2015-03-25 2017-11-07 A9.com Expanded previously on segments
US10469918B1 (en) * 2015-03-25 2019-11-05 A9.Com, Inc. Expanded previously on segments
US10482349B2 (en) 2015-04-17 2019-11-19 Inscape Data, Inc. Systems and methods for reducing data density in large datasets
US20160316261A1 (en) * 2015-04-23 2016-10-27 Sorenson Media, Inc. Automatic content recognition fingerprint sequence matching
US11659255B2 (en) 2015-07-16 2023-05-23 Inscape Data, Inc. Detection of common media segments
US10080062B2 (en) 2015-07-16 2018-09-18 Inscape Data, Inc. Optimizing media fingerprint retention to improve system resource utilization
US11308144B2 (en) 2015-07-16 2022-04-19 Inscape Data, Inc. Systems and methods for partitioning search indexes for improved efficiency in identifying media segments
US11451877B2 (en) 2015-07-16 2022-09-20 Inscape Data, Inc. Optimizing media fingerprint retention to improve system resource utilization
US10674223B2 (en) 2015-07-16 2020-06-02 Inscape Data, Inc. Optimizing media fingerprint retention to improve system resource utilization
US10873788B2 (en) 2015-07-16 2020-12-22 Inscape Data, Inc. Detection of common media segments
US10902048B2 (en) 2015-07-16 2021-01-26 Inscape Data, Inc. Prediction of future views of video segments to optimize system resource utilization
US11329980B2 (en) 2015-08-21 2022-05-10 Veridium Ip Limited System and method for biometric protocol standards
US9723344B1 (en) * 2015-12-29 2017-08-01 Google Inc. Early detection of policy violating media
US20170357875A1 (en) * 2016-06-08 2017-12-14 International Business Machines Corporation Detecting usage of copyrighted video content using object recognition
US11301714B2 (en) 2016-06-08 2022-04-12 International Business Machines Corporation Detecting usage of copyrighted video content using object recognition
US9996769B2 (en) * 2016-06-08 2018-06-12 International Business Machines Corporation Detecting usage of copyrighted video content using object recognition
US10579899B2 (en) 2016-06-08 2020-03-03 International Business Machines Corporation Detecting usage of copyrighted video content using object recognition
WO2018004718A1 (en) * 2016-06-27 2018-01-04 Facebook, Inc. Systems and methods for identifying matching content
US20170371963A1 (en) * 2016-06-27 2017-12-28 Facebook, Inc. Systems and methods for identifying matching content
US10650241B2 (en) 2016-06-27 2020-05-12 Facebook, Inc. Systems and methods for identifying matching content
EP3264323A1 (en) * 2016-06-27 2018-01-03 Facebook, Inc. Systems and methods for identifying matching content
US11030462B2 (en) 2016-06-27 2021-06-08 Facebook, Inc. Systems and methods for storing content
US20180101540A1 (en) * 2016-10-10 2018-04-12 Facebook, Inc. Diversifying Media Search Results on Online Social Networks
WO2018089094A1 (en) * 2016-11-11 2018-05-17 Google Llc Differential scoring: a high-precision scoring method for video matching
US10061987B2 (en) 2016-11-11 2018-08-28 Google Llc Differential scoring: a high-precision scoring method for video matching
US10983984B2 (en) 2017-04-06 2021-04-20 Inscape Data, Inc. Systems and methods for improving accuracy of device maps using media viewing data
CN108875315A (en) * 2017-05-10 2018-11-23 Google LLC Methods, systems, and media for transforming fingerprints to detect unauthorized media content items
US9936230B1 (en) * 2017-05-10 2018-04-03 Google Llc Methods, systems, and media for transforming fingerprints to detect unauthorized media content items
US20180332319A1 (en) * 2017-05-10 2018-11-15 Google Llc Methods, systems, and media for transforming fingerprints to detect unauthorized media content items
US10536729B2 (en) * 2017-05-10 2020-01-14 Google Llc Methods, systems, and media for transforming fingerprints to detect unauthorized media content items
EP3404926A1 (en) * 2017-05-17 2018-11-21 Snell Advanced Media Limited Generation of visual hash
US10796158B2 (en) 2017-05-17 2020-10-06 Grass Valley Limited Generation of video hash
US11341747B2 (en) 2017-05-17 2022-05-24 Grass Valley Limited Generation of video hash
US11574643B2 (en) * 2017-06-08 2023-02-07 The Nielsen Company (Us), Llc Methods and apparatus for audio signature generation and matching
US10872614B2 (en) 2017-06-08 2020-12-22 The Nielsen Company (Us), Llc Methods and apparatus for audio signature generation and matching
US10236005B2 (en) * 2017-06-08 2019-03-19 The Nielsen Company (Us), Llc Methods and apparatus for audio signature generation and matching
US20210249023A1 (en) * 2017-06-08 2021-08-12 The Nielsen Company (Us), Llc Methods and apparatus for audio signature generation and matching
US10911815B1 (en) 2017-09-25 2021-02-02 Amazon Technologies, Inc. Personalized recap clips
US10555023B1 (en) 2017-09-25 2020-02-04 Amazon Technologies, Inc. Personalized recap clips
US10713495B2 (en) 2018-03-13 2020-07-14 Adobe Inc. Video signatures based on image feature extraction
CN109255777B (en) * 2018-07-27 2021-10-22 昆明理工大学 Image similarity calculation method combining wavelet transformation and perceptual hash algorithm
CN109255777A (en) * 2018-07-27 2019-01-22 Kunming University of Science and Technology An image similarity calculation method combining wavelet transform and a perceptual hash algorithm
US11356733B2 (en) * 2018-11-05 2022-06-07 The Nielsen Company (Us), Llc Methods and apparatus to generate reference signatures
US11716510B2 (en) 2018-11-05 2023-08-01 The Nielsen Company (Us), Llc Methods and apparatus to generate reference signatures
US10911824B2 (en) * 2018-11-05 2021-02-02 The Nielsen Company (Us), Llc Methods and apparatus to generate reference signatures
US20210352341A1 (en) * 2020-05-06 2021-11-11 At&T Intellectual Property I, L.P. Scene cut-based time alignment of video streams

Also Published As

Publication number Publication date
US8340449B1 (en) 2012-12-25
US8611689B1 (en) 2013-12-17

Similar Documents

Publication Publication Date Title
US8094872B1 (en) Three-dimensional wavelet based video fingerprinting
US8611422B1 (en) Endpoint based video fingerprinting
US8655103B2 (en) Deriving an image representation using frequency components of a frequency representation
Oostveen et al. Feature extraction and a database strategy for video fingerprinting
US8947595B1 (en) Fingerprinting to match videos having varying aspect ratios
US8830331B2 (en) Method and apparatus for detecting near-duplicate videos using perceptual video signatures
TWI528196B (en) Similar image recognition method and apparatus
US8655056B2 (en) Content-based matching of videos using local spatio-temporal fingerprints
Galvan et al. First quantization matrix estimation from double compressed JPEG images
JP4771906B2 (en) Method for classifying images with respect to JPEG compression history
US8392427B1 (en) LSH-based retrieval using sub-sampling
US8184953B1 (en) Selection of hash lookup keys for efficient retrieval
US20080159403A1 (en) System for Use of Complexity of Audio, Image and Video as Perceived by a Human Observer
Sarkar et al. Video fingerprinting: features for duplicate and similar video detection and query-based video retrieval
WO2019195835A1 (en) Comparing frame data to generate a textless version of a multimedia production
KR101634395B1 (en) Video identification
Nie et al. Robust video hashing based on representative-dispersive frames
Bracamonte et al. Efficient compressed domain target image search and retrieval
Yannikos et al. Automating video file carving and content identification
Raju et al. Video copy detection in distributed environment
Mahdian et al. Image tampering detection using methods based on JPEG compression artifacts: a real-life experiment
Chaisorn et al. A fast and efficient framework for indexing and detection of modified copies in video
Dixit et al. DyWT based copy-move forgery detection with improved detection accuracy
Ayalneh et al. Early width estimation of fragmented JPEG with corrupted header
Fatourechi et al. Image and Video Copy Detection Using Content-Based Fingerprinting

Legal Events

Date Code Title Description
AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YAGNIK, JAY;ROWLEY, HENRY A.;IOFFE, SERGEY;SIGNING DATES FROM 20070507 TO 20070508;REEL/FRAME:019271/0297

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: GOOGLE LLC, CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044101/0405

Effective date: 20170929

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12