Add Quanto4,2, HQQ4,2 KV cache quantization support to Transformers loader #6768

Get quanto4,2 KV cache working in Transformers (4a55c742)
Add optimum-quanto to requirements (d0ea0509)
Add HQQ KV cache quantization for Transformers (244d3cb6)

dinerburger changed the title from "Add Quanto4,2 KV cache quantization support to Transformers loader" to "Add Quanto4,2, HQQ4,2 KV cache quantization support to Transformers loader" 245 days ago

cceneag requested changes on 2025-10-08

Assignees
No one assigned